Predictive failure analysis

Predictive Failure Analysis (PFA) was originally used as a term for a proprietary IBM technology for monitoring the likelihood of hard disk drives failing, although the term is now used generically for a variety of technologies for judging the imminent failure of CPUs, memory and I/O devices.

IBM introduced the term PFA and the associated technology in 1992 with reference to its 0662-S1x drive, a 1052 MB Fast-Wide SCSI-2 disk that operated at 5400 rpm.

The technology relies on measuring several key (mainly mechanical) parameters of the drive unit, such as the flying height of the heads.

If the drive appears likely to fail soon, the system sends a notification to the disk controller.
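
The measure-and-notify idea can be illustrated with a minimal Python sketch. It is not IBM's implementation: the ParameterLimit type, the parameter name, the nominal value and the tolerance are all hypothetical, and the sketch simply assumes that a drive is flagged when a measured parameter drifts too far from its nominal value.

```python
from dataclasses import dataclass


@dataclass
class ParameterLimit:
    """Hypothetical tolerance for one monitored drive parameter."""
    name: str
    nominal: float        # expected value, in arbitrary units
    max_deviation: float  # assumed tolerance around the nominal value


def failing_parameters(readings, limits):
    """Return the names of parameters whose reading is out of tolerance."""
    flagged = []
    for limit in limits:
        value = readings.get(limit.name)
        if value is None:
            continue  # no measurement available for this parameter
        if abs(value - limit.nominal) > limit.max_deviation:
            flagged.append(limit.name)
    return flagged


def check_drive(readings, limits, notify):
    """Call notify(flagged) if any parameter suggests imminent failure."""
    flagged = failing_parameters(readings, limits)
    if flagged:
        notify(flagged)
    return bool(flagged)
```

In this sketch the notify callback stands in for the notification sent to the disk controller; a real drive would evaluate many parameters and would typically consider their trend over time rather than a single reading.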

High counts of intermittent RAM errors corrected by ECC can be predictive of future DIMM failures,[2] so automatic offlining of memory and CPU caches can be used to avoid future errors.[3] For example, under the Linux operating system the mcelog daemon automatically removes from use memory pages that show excessive corrections, and removes from use processor cores that show excessive correctable cache memory errors.
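
The offlining policy described above can be sketched in Python as a per-page counter of corrected errors. This is an illustration only and does not reproduce mcelog's actual algorithm or configuration: the CorrectedErrorTracker class, the threshold and the time window are hypothetical, and "offlining" is modelled as adding the page to a set.

```python
from collections import defaultdict, deque


class CorrectedErrorTracker:
    """Track corrected errors per page and offline pages that exceed a rate."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold            # assumed: errors that trigger offlining
        self.window_seconds = window_seconds  # assumed: sliding window length
        self._events = defaultdict(deque)     # page -> recent error timestamps
        self.offlined = set()                 # pages removed from use

    def record(self, page, timestamp):
        """Record one corrected error; return True if the page is newly offlined."""
        if page in self.offlined:
            return False
        events = self._events[page]
        events.append(timestamp)
        # Discard errors that fall outside the sliding window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        if len(events) >= self.threshold:
            self.offlined.add(page)
            del self._events[page]
            return True
        return False
```

A sliding window is used so that occasional, widely spaced corrections do not accumulate into an offlining decision; only a burst of corrections on the same page within the window does.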