Hodges–Lehmann estimator

[1] In the simplest case, the "Hodges–Lehmann" statistic estimates the location parameter for a univariate population.

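For a dataset with n measurements, consider the set of all pairs (x_i, x_j) of measurements with i ≤ j (i.e. specifically including self-pairs; many secondary sources incorrectly omit this detail); this set has n(n + 1)/2 elements.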

For each such pair, the mean is computed; finally, the median of these n(n + 1)/2 averages is defined to be the Hodges–Lehmann estimator of location.
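The computation just described can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function name `hodges_lehmann` is chosen here for illustration, and the brute-force enumeration of all n(n + 1)/2 pairs is only practical for small samples.

```python
from itertools import combinations_with_replacement
from statistics import median


def hodges_lehmann(xs):
    """One-sample Hodges-Lehmann estimate (illustrative helper name).

    Takes the median of the pairwise averages over all pairs
    (x_i, x_j) with i <= j, so self-pairs are included.
    """
    if len(xs) == 0:
        raise ValueError("need at least one observation")
    # combinations_with_replacement pairs elements by position, so it
    # yields exactly n(n + 1)/2 pairs even when values repeat.
    averages = [(a + b) / 2 for a, b in combinations_with_replacement(xs, 2)]
    return median(averages)


sample = [1.0, 2.0, 4.0]
# The six pairwise averages are 1, 1.5, 2.5, 2, 3 and 4.
estimate = hodges_lehmann(sample)
print(estimate)  # 2.25
```

For the three-point sample the averages sort to 1, 1.5, 2, 2.5, 3, 4, and the median of an even-length list is the midpoint of the two central values, which gives 2.25.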

The two-sample Hodges–Lehmann statistic is an estimate of a location-shift type difference between two populations.
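The passage above does not spell out the computation. In the usual definition, the two-sample statistic is the median of all m × n pairwise differences between the two samples, and the sketch below follows that definition. The function name `hodges_lehmann_shift` and the sign convention (second sample minus first) are assumptions made for this illustration.

```python
from itertools import product
from statistics import median


def hodges_lehmann_shift(xs, ys):
    """Two-sample Hodges-Lehmann shift estimate (illustrative helper name).

    Returns the median of all m * n differences y - x, with one value
    taken from each sample. The sign convention is an assumption.
    """
    if len(xs) == 0 or len(ys) == 0:
        raise ValueError("both samples must be non-empty")
    differences = [y - x for x, y in product(xs, ys)]
    return median(differences)


first = [1.0, 2.0, 3.0]
second = [11.0, 12.0, 13.0]  # the first sample shifted by 10
shift = hodges_lehmann_shift(first, second)
print(shift)  # 10.0
```

When the second sample is an exact translate of the first, as here, the estimate recovers the shift exactly.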

[4] In the general case, the Hodges–Lehmann statistic estimates the population's pseudomedian,[5] a location parameter that is closely related to the median.

The difference between the median and the pseudomedian is relatively small, and so this distinction is neglected in elementary discussions.

This robustness is an important advantage over the sample mean, which has a breakdown point of zero: the mean depends linearly on every observation, so even a single outlier can mislead it.
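The contrast with the sample mean can be illustrated on a small, made-up sample in which one observation is replaced by a gross outlier. The data and the helper name `hodges_lehmann` are hypothetical and serve only as an illustration; a single example of this kind shows the behaviour but does not establish a breakdown point.

```python
from itertools import combinations_with_replacement
from statistics import mean, median


def hodges_lehmann(xs):
    """Median of pairwise averages over pairs i <= j (illustrative helper)."""
    return median((a + b) / 2 for a, b in combinations_with_replacement(xs, 2))


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 1000.0]  # last value replaced by an outlier

mean_clean, mean_bad = mean(clean), mean(contaminated)
hl_clean, hl_bad = hodges_lehmann(clean), hodges_lehmann(contaminated)

print(mean_clean, mean_bad)  # 3.0 202.0
print(hl_clean, hl_bad)      # 3.0 3.0
```

Only 5 of the 15 pairwise averages involve the outlier, so the middle (eighth) ordered average is unaffected, whereas the mean moves from 3 to 202.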

For the Cauchy distribution (Student t-distribution with one degree of freedom), the Hodges–Lehmann estimator is infinitely more efficient than the sample mean, which is not a consistent estimator of the median,[8] but it is not more efficient than the median in that instance.

The one-sample Hodges–Lehmann statistic need not estimate any population mean, which for many distributions does not exist.