Local outlier factor

In anomaly detection, the local outlier factor (LOF) is an algorithm proposed by Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng and Jörg Sander in 2000 for finding anomalous data points by measuring the local deviation of a given data point with respect to its neighbours.

The "reachability distance" used by LOF has some subtle details that are often found incorrect in secondary sources, e.g., in the textbook of Ethem Alpaydin.

Objects that belong to the k nearest neighbors of B (the "core" of B, see DBSCAN cluster analysis) are considered to be equally distant.

While the geometric intuition of LOF is only applicable to low-dimensional vector spaces, the algorithm can be applied in any context a dissimilarity function can be defined.

It has experimentally been shown to work very well in numerous setups, often outperforming the competitors, for example in network intrusion detection[5] and on processed classification benchmark data.

[6] The LOF family of methods can be easily generalized and then applied to various other problems, such as detecting outliers in geographic data, video streams or authorship networks.

Basic idea of LOF: comparing the local density of a point with the densities of its neighbors. A has a much lower density than its neighbors.
Illustration of the reachability distance. Objects B and C have the same reachability distance ( k=3 ), while D is not a k nearest neighbor
LOF scores as visualized by ELKI . While the upper right cluster has a comparable density to the outliers close to the bottom left cluster, they are detected correctly.