Efficiency (statistics)

In statistics, efficiency is a measure of quality of an estimator, of an experimental design,[1] or of a hypothesis testing procedure.

An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviation between the estimated value and the "true" value in the L2 norm sense.

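The efficiency of an unbiased estimator, T, of a parameter θ is defined as [3] e(T) = (1/I(θ)) / var(T), where I(θ) is the Fisher information of the sample. Thus e(T) is the minimum possible variance for an unbiased estimator divided by its actual variance, and the Cramér–Rao bound implies that e(T) ≤ 1.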

The notion of “best possible” relies upon the choice of a particular loss function — the function which quantifies the relative degree of undesirability of estimation errors of different magnitudes.

The most common choice of the loss function is quadratic, resulting in the mean squared error criterion of optimality.

The performance of an estimator T of θ can be quantified by its mean squared error, MSE(T) = E[(T − θ)²].

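[5] For a more specific case, if T1 and T2 are two unbiased estimators for the same parameter θ, then their variances can be compared to determine performance: T2 is more efficient than T1 if the variance of T2 is smaller than the variance of T1 for all values of θ.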

This relationship can be determined by simplifying the more general case above for mean squared error: since the expected value of an unbiased estimator is equal to the parameter value, E[T] = θ, the bias is zero and the mean squared error reduces to the variance, MSE(T) = var(T).
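The simplification can be written out as a short derivation. This is a sketch of the standard bias-variance decomposition; the notation MSE(T), var(T) and bias(T) = E[T] − θ is assumed here and may differ from the notation of the cited sources.

```latex
\begin{aligned}
\operatorname{MSE}(T)
  &= \operatorname{E}\!\left[(T-\theta)^2\right] \\
  &= \operatorname{E}\!\left[(T-\operatorname{E}[T])^2\right]
     + 2\,(\operatorname{E}[T]-\theta)\,\operatorname{E}\!\left[T-\operatorname{E}[T]\right]
     + (\operatorname{E}[T]-\theta)^2 \\
  &= \operatorname{var}(T) + \operatorname{bias}(T)^2 ,
  \qquad \text{since } \operatorname{E}\!\left[T-\operatorname{E}[T]\right] = 0 .
\end{aligned}
```

For an unbiased estimator the bias term is zero, so comparing mean squared errors is the same as comparing variances.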

[3] Equivalently, the estimator achieves equality in the Cramér–Rao inequality for all θ.

An efficient estimator is also the minimum-variance unbiased estimator (MVUE). This is because an efficient estimator maintains equality in the Cramér–Rao inequality for all parameter values, which means it attains the minimum variance for all parameters (the definition of the MVUE).

The MVUE, even if it exists, is not necessarily efficient, because "minimum" does not mean that equality holds in the Cramér–Rao inequality.

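For an unbiased estimator T of θ, the Cramér–Rao inequality bounds the variance from below: var(T) ≥ I(θ)⁻¹, where I(θ) is the Fisher information matrix of the model at point θ.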

Generally, the variance measures the degree of dispersion of a random variable around its mean.

However, the converse is false: there exist point-estimation problems for which the minimum-variance mean-unbiased estimator is inefficient.

[6] Historically, finite-sample efficiency was an early optimality criterion.

As an example, consider a normal model N(θ, σ²) with unknown mean θ and known variance σ². The data consist of n independent and identically distributed observations from this model: X = (x1, …, xn).

We estimate the parameter θ using the sample mean of all observations: T(X) = (x1 + … + xn)/n. This estimator has mean θ and variance σ²/n, which is equal to the reciprocal of the Fisher information from the sample.
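The claim about the sample mean can be checked numerically. The following is a minimal Monte Carlo sketch, not a proof: it assumes normally distributed data, and the function name `sample_mean_variance`, the parameter values and the seed are all illustrative choices rather than anything taken from the sources.

```python
import random
import statistics


def sample_mean_variance(theta, sigma, n, trials, seed=0):
    """Monte Carlo estimate of the variance of the sample mean
    of n independent N(theta, sigma^2) observations."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [
        statistics.fmean(rng.gauss(theta, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pvariance(means)


# Illustrative values (assumptions, not from the article).
theta, sigma, n = 2.0, 3.0, 25

empirical = sample_mean_variance(theta, sigma, n, trials=20000)
# The Fisher information of the sample is n / sigma^2, so the
# Cramér–Rao lower bound is its reciprocal, sigma^2 / n.
cramer_rao_bound = sigma**2 / n
efficiency = cramer_rao_bound / empirical

print(f"empirical variance: {empirical:.4f}")
print(f"Cramér–Rao bound:   {cramer_rao_bound:.4f}")
print(f"estimated efficiency: {efficiency:.3f}")
```

The estimated efficiency should come out close to 1, up to simulation noise.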

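Consider the sample mean X̄ of N independent observations X1, …, XN from a normal distribution with unit variance, defined as X̄ = (X1 + … + XN)/N. The variance of the mean, 1/N (the square of the standard error), is equal to the reciprocal of the Fisher information from the sample and thus, by the Cramér–Rao inequality, the sample mean is efficient in the sense that its efficiency is unity (100%).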

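Now consider the sample median. For large N, the sample median is approximately normally distributed with mean equal to the population mean and variance π/(2N), so its efficiency is (1/N) / (π/(2N)) = 2/π ≈ 64%.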
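The efficiency of the median relative to the mean at the normal distribution can be illustrated by simulation. This is a sketch under assumed settings (standard normal data, odd sample size 101, 5000 replications, a fixed seed); the helper name `estimator_variances` is hypothetical. The large-sample value for comparison is 2/π.

```python
import math
import random
import statistics


def estimator_variances(n, trials, seed=1):
    """Return (variance of sample mean, variance of sample median)
    over repeated standard normal samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)


var_mean, var_median = estimator_variances(n=101, trials=5000)

# Efficiency of the median relative to the mean: ratio of variances.
relative_efficiency = var_mean / var_median

print(f"simulated relative efficiency: {relative_efficiency:.3f}")
print(f"asymptotic value 2/pi:         {2 / math.pi:.3f}")
```

For a finite sample the simulated ratio will only be near 2/π, not equal to it.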

An alternative to relative efficiency for comparing estimators is the Pitman closeness criterion.

In this case efficiency can be defined as the square of the coefficient of variation, i.e.,[13] e ≡ (σ/μ)². The relative efficiency of two such estimators can thus be interpreted as the relative sample size of one required to achieve the certainty of the other.
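The sample-size interpretation can be made concrete with a small calculation. The sketch below assumes that both estimators have variances proportional to 1/n, which is what makes the ratio of variances translate into a ratio of sample sizes; the function name `matching_sample_size` is hypothetical.

```python
import math


def matching_sample_size(n_reference, relative_efficiency):
    """Sample size a less efficient estimator needs in order to match
    the variance the reference estimator attains with n_reference
    observations, assuming both variances scale as 1/n."""
    if not 0 < relative_efficiency <= 1:
        raise ValueError("relative_efficiency must be in (0, 1]")
    return math.ceil(n_reference / relative_efficiency)


# An estimator with 50% relative efficiency needs twice the data.
n_half = matching_sample_size(100, 0.5)

# Using 2/pi, the large-sample efficiency of the sample median
# relative to the sample mean for normal data.
n_median = matching_sample_size(100, 2 / math.pi)

print(n_half, n_median)
```

Rounding up with `math.ceil` is a design choice here, since a sample size must be a whole number of observations.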

This is one of the motivations of robust statistics – an estimator such as the sample mean is an efficient estimator of the population mean of a normal distribution, for example, but can be an inefficient estimator of a mixture distribution of two normal distributions with the same mean and different variances.
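The loss of efficiency under a mixture can be seen in a small simulation. This is a sketch under illustrative assumptions: a mixture in which 10% of observations come from a normal distribution with standard deviation 10 and the rest from a standard normal, both with mean 0. The function name `mean_efficiency_vs_median` and all parameter values are hypothetical choices, not taken from the sources.

```python
import random
import statistics


def mean_efficiency_vs_median(contamination, wide_sigma, n, trials, seed=2):
    """Efficiency of the sample mean relative to the sample median,
    var(median) / var(mean), for a two-component normal mixture with
    common mean 0 and standard deviations 1 and wide_sigma."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(trials):
        sample = [
            # Each draw comes from the wide component with
            # probability `contamination`, else from N(0, 1).
            rng.gauss(0.0, wide_sigma if rng.random() < contamination else 1.0)
            for _ in range(n)
        ]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(medians) / statistics.pvariance(means)


clean = mean_efficiency_vs_median(0.0, 10.0, n=51, trials=3000)
mixed = mean_efficiency_vs_median(0.1, 10.0, n=51, trials=3000)

print(f"pure normal:   mean vs median efficiency = {clean:.2f}")
print(f"10% mixture:   mean vs median efficiency = {mixed:.2f}")
```

A value above 1 means the mean has the smaller variance; a value below 1 means the median does. Under these assumptions the mean should win for pure normal data and lose clearly under the mixture.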

Similarly, the shape of a distribution, such as skewness or heavy tails, can significantly reduce the efficiency of estimators that assume a symmetric distribution or thin tails.

M-estimators are a general class of estimators motivated by these concerns.

More traditional alternatives are L-estimators, which are very simple statistics that are easy to compute and interpret, in many cases robust, and often sufficiently efficient for initial estimates.

Efficiency in statistics is important because it allows one to compare the performance of various estimators.

Thus, the performance of estimators can be compared directly through their mean squared errors or, for unbiased estimators, their variances.

For comparing significance tests, a meaningful measure of efficiency can be defined based on the sample size required for the test to achieve a given power.

For experimental designs, efficiency relates to the ability of a design to achieve the objective of the study with minimal expenditure of resources such as time and money.

In simple cases, the relative efficiency of designs can be expressed as the ratio of the sample sizes required to achieve a given objective.