Cramér–Rao bound

The result is named in honor of Harald Cramér and Calyampudi Radhakrishna Rao,[1][2][3] but has also been derived independently by Maurice Fréchet,[4] Georges Darmois,[5] and by Alexander Aitken and Harold Silverstone.

It states that the precision of any unbiased estimator is at most the Fisher information; or (equivalently) the reciprocal of the Fisher information is a lower bound on its variance.

An unbiased estimator that achieves this bound is said to be (fully) efficient.

Such a solution achieves the lowest possible mean squared error among all unbiased methods, and is, therefore, the minimum variance unbiased (MVU) estimator.

However, in some cases, no unbiased technique exists which achieves the bound.

This may occur either if for any unbiased estimator, there exists another with a strictly smaller variance, or if an MVU estimator exists, but its variance is strictly greater than the inverse of the Fisher information.

In some cases, a biased approach can result in both a variance and a mean squared error that are below the unbiased Cramér–Rao lower bound; see estimator bias.

A significant improvement on the Cramér–Rao lower bound was proposed by Anil Kumar Bhattacharyya in a series of works; it is known as the Bhattacharyya bound.[8][9][10][11]

The Cramér–Rao bound is stated in this section for several increasingly general cases, beginning with the case in which the parameter is a scalar and its estimator is unbiased.

All versions of the bound require certain regularity conditions, which hold for most well-behaved distributions.

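Suppose $\theta$ is an unknown deterministic parameter that is to be estimated from $n$ independent observations of $x$, each distributed according to a probability density function $f(x;\theta)$.

The variance of any unbiased estimator $\hat{\theta}$ of $\theta$ is then bounded by the reciprocal of the Fisher information $I(\theta)$:

$$\operatorname{var}(\hat{\theta}) \geq \frac{1}{I(\theta)},$$

where the Fisher information is defined by

$$I(\theta) = n \operatorname{E}_{X;\theta}\left[\left(\frac{\partial \ell(X;\theta)}{\partial \theta}\right)^{2}\right],$$

$\ell(x;\theta) = \log f(x;\theta)$ is the natural logarithm of the likelihood function for a single sample $x$, and $\operatorname{E}_{X;\theta}$ denotes the expected value with respect to the density $f(x;\theta)$ of $X$.

If $\ell(x;\theta)$ is twice differentiable and certain regularity conditions hold, then the Fisher information can also be defined as follows:[13]

$$I(\theta) = -n \operatorname{E}_{X;\theta}\left[\frac{\partial^{2} \ell(X;\theta)}{\partial \theta^{2}}\right].$$

The efficiency of an unbiased estimator $\hat{\theta}$ measures how close this estimator's variance comes to this lower bound; estimator efficiency is defined as

$$e(\hat{\theta}) = \frac{I(\theta)^{-1}}{\operatorname{var}(\hat{\theta})},$$

or the minimum possible variance for an unbiased estimator divided by its actual variance.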
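As a rough numerical illustration of efficiency, the following Python sketch estimates by Monte Carlo simulation the efficiency of the sample mean and of the sample median as estimators of the mean of a normal distribution with known variance. In that setting the Fisher information about the mean in $n$ independent observations is $n/\sigma^2$. The function names, the parameter values, and the use of NumPy are assumptions made for this illustration only.

```python
import numpy as np

def fisher_information_normal_mean(sigma, n):
    """Fisher information about the mean in n i.i.d. N(theta, sigma^2) samples."""
    return n / sigma**2

def empirical_efficiency(estimator, theta=1.0, sigma=2.0, n=25,
                         trials=20000, seed=0):
    """Estimate (Cramér–Rao bound) / (observed variance) for an estimator.

    `estimator` maps an array of shape (trials, n) to `trials` estimates.
    All default values are hypothetical choices for this sketch.
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(theta, sigma, size=(trials, n))
    estimates = estimator(samples)
    bound = 1.0 / fisher_information_normal_mean(sigma, n)
    return bound / estimates.var()

# Both estimators are unbiased for the mean of a symmetric distribution.
eff_mean = empirical_efficiency(lambda s: s.mean(axis=1))
eff_median = empirical_efficiency(lambda s: np.median(s, axis=1))

print(f"efficiency of sample mean:   {eff_mean:.3f}")
print(f"efficiency of sample median: {eff_median:.3f}")
```

Under these assumptions the sample mean should come out with an efficiency close to 1, while the sample median should come out noticeably lower (its asymptotic efficiency for normal data is $2/\pi \approx 0.64$).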

Apart from being a bound on estimators of functions of the parameter, this approach can be used to derive a bound on the variance of biased estimators with a given bias, as follows.

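Consider an estimator $\hat{\theta}$ with bias $b(\theta) = \operatorname{E}[\hat{\theta}] - \theta$. Any estimator whose bias is given by the function $b(\theta)$ satisfies[15]

$$\operatorname{var}(\hat{\theta}) \geq \frac{[1 + b'(\theta)]^{2}}{I(\theta)}.$$

The unbiased version of the bound is a special case of this result, with $b(\theta) = 0$.

A small variance alone is trivial to obtain: an "estimator" that is constant has a variance of zero. But from the above equation, we find that the mean squared error of a biased estimator is bounded by

$$\operatorname{E}\left[(\hat{\theta} - \theta)^{2}\right] \geq \frac{[1 + b'(\theta)]^{2}}{I(\theta)} + b(\theta)^{2},$$

using the standard decomposition of the MSE.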
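The bound for biased estimators, $\operatorname{var}(\hat{\theta}) \geq [1 + b'(\theta)]^2 / I(\theta)$, together with the decomposition of the MSE into variance plus squared bias, can be evaluated for a simple case. The Python sketch below uses a hypothetical shrinkage estimator $c\bar{X}$ of the mean of $n$ normal observations with known variance, for which $b(\theta) = (c - 1)\theta$ and $I(\theta) = n/\sigma^2$. The helper names and all numerical values are assumptions chosen for illustration.

```python
def biased_crb_variance(b_prime, fisher_info):
    """Lower bound on the variance of an estimator with bias derivative b'."""
    return (1.0 + b_prime) ** 2 / fisher_info

def biased_crb_mse(b, b_prime, fisher_info):
    """Lower bound on the MSE: variance bound plus squared bias."""
    return biased_crb_variance(b_prime, fisher_info) + b ** 2

# Hypothetical setting: n normal samples, known sigma, estimator c * sample_mean.
n, sigma, theta, c = 10, 1.0, 0.5, 0.9
fisher_info = n / sigma**2          # Fisher information about the mean
bias = (c - 1.0) * theta            # b(theta)
bias_prime = c - 1.0                # b'(theta)

unbiased_bound = 1.0 / fisher_info
variance_bound = biased_crb_variance(bias_prime, fisher_info)
mse_bound = biased_crb_mse(bias, bias_prime, fisher_info)

# Exact variance and MSE of c * sample_mean for comparison.
actual_variance = c**2 * sigma**2 / n
actual_mse = actual_variance + bias**2

print(unbiased_bound, variance_bound, mse_bound, actual_mse)
```

With these particular values the shrinkage estimator meets the biased bound with equality, and its MSE falls below the unbiased bound $\sigma^2/n$. That holds only because the true mean is close to zero relative to $\sigma/\sqrt{n}$; for large $|\theta|$ the squared bias dominates.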

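If $\boldsymbol{T}(X)$ is an unbiased estimator of the parameter vector $\boldsymbol{\theta}$ (i.e., $\boldsymbol{\psi}(\boldsymbol{\theta}) = \boldsymbol{\theta}$), then the Cramér–Rao bound reduces to

$$\operatorname{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right) \geq I(\boldsymbol{\theta})^{-1},$$

where the matrix inequality means that the difference of the two sides is positive semidefinite.

If it is inconvenient to compute the inverse of the Fisher information matrix, then one can simply take the reciprocal of the corresponding diagonal element to find a (possibly loose) lower bound:[16]

$$\operatorname{var}_{\boldsymbol{\theta}}\left(T_m(X)\right) = \left[\operatorname{cov}_{\boldsymbol{\theta}}\left(\boldsymbol{T}(X)\right)\right]_{mm} \geq \left[I(\boldsymbol{\theta})^{-1}\right]_{mm} \geq \left(\left[I(\boldsymbol{\theta})\right]_{mm}\right)^{-1}.$$

The bound relies on two weak regularity conditions on the probability density function, $f(x;\theta)$, and the estimator $T(X)$.

First, the Fisher information is always defined; equivalently, for all $x$ such that $f(x;\theta) > 0$, the derivative $\frac{\partial}{\partial\theta} \log f(x;\theta)$ exists and is finite.

Second, the operations of integration with respect to $x$ and differentiation with respect to $\theta$ can be interchanged in the expectation of $T$; that is, $\frac{\partial}{\partial\theta} \int T(x) f(x;\theta)\,dx = \int T(x) \frac{\partial}{\partial\theta} f(x;\theta)\,dx$ whenever the right-hand side is finite.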
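The shortcut of taking reciprocals of the diagonal elements of the Fisher information matrix, instead of inverting the matrix, can be compared with the full bound numerically. The following Python sketch does so for a made-up $2 \times 2$ Fisher information matrix; the matrix entries and variable names are hypothetical and serve only to show that the shortcut is never tighter than the full bound.

```python
import numpy as np

# Hypothetical Fisher information matrix (symmetric, positive definite).
fisher = np.array([[4.0, 1.5],
                   [1.5, 1.0]])

# Full Cramér–Rao bound on each component: diagonal of the inverse matrix.
full_bound = np.diag(np.linalg.inv(fisher))

# Shortcut: reciprocal of each diagonal element (possibly loose).
loose_bound = 1.0 / np.diag(fisher)

print("diag of inverse:       ", full_bound)
print("reciprocal of diagonal:", loose_bound)
```

The two bounds coincide when the Fisher information matrix is diagonal; the gap grows as the off-diagonal coupling between parameters increases.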

Second equation: It suffices to prove this for the scalar case, with

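Let $X$ be a random variable with probability density function $f(x;\theta)$, and let $T = t(X)$ be a statistic used as an estimator of $\psi(\theta)$, so that $\operatorname{E}(T) = \psi(\theta)$.

Define $V$ as the score:

$$V = \frac{\partial}{\partial\theta} \ln f(X;\theta) = \frac{1}{f(X;\theta)} \frac{\partial}{\partial\theta} f(X;\theta),$$

where the chain rule is used in the final equality above.

The expectation of $V$ is zero. This is because:

$$\operatorname{E}(V) = \int f(x;\theta) \left[\frac{1}{f(x;\theta)} \frac{\partial}{\partial\theta} f(x;\theta)\right] dx = \frac{\partial}{\partial\theta} \int f(x;\theta)\,dx = 0,$$

where the integral and partial derivative have been interchanged (justified by the second regularity condition).

Since $\operatorname{E}(V) = 0$, the covariance of $V$ and $T$ satisfies $\operatorname{cov}(V,T) = \operatorname{E}(VT)$. Expanding this expression we have

$$\operatorname{cov}(V,T) = \operatorname{E}\left(T \cdot \left[\frac{1}{f(X;\theta)} \frac{\partial}{\partial\theta} f(X;\theta)\right]\right) = \int t(x) \frac{\partial}{\partial\theta} f(x;\theta)\,dx = \frac{\partial}{\partial\theta} \int t(x) f(x;\theta)\,dx = \frac{\partial}{\partial\theta} \operatorname{E}(T) = \psi'(\theta),$$

again because the integration and differentiation operations commute (second condition).

The Cauchy–Schwarz inequality then gives $\sqrt{\operatorname{var}(T)\operatorname{var}(V)} \geq |\operatorname{cov}(V,T)| = |\psi'(\theta)|$, and therefore

$$\operatorname{var}(T) \geq \frac{[\psi'(\theta)]^{2}}{\operatorname{var}(V)} = \frac{[\psi'(\theta)]^{2}}{I(\theta)}.$$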
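The two facts used in this argument, that the score has zero expectation and that its covariance with the estimator equals the derivative of the estimator's expectation, can be checked by simulation. The Python sketch below assumes a single observation from an exponential distribution with rate $\lambda$, for which the score is $1/\lambda - x$, the estimator $T = x$ has expectation $\psi(\lambda) = 1/\lambda$, and the Fisher information is $1/\lambda^2$. The choice of distribution, the variable names, and the numerical values are assumptions for illustration.

```python
import numpy as np

rate = 2.0           # hypothetical true parameter (lambda)
n_draws = 200_000    # number of simulated observations

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0 / rate, size=n_draws)

# Score of one observation: d/d(lambda) of log(lambda * exp(-lambda * x)).
score = 1.0 / rate - x
t = x                # estimator T = x, with E(T) = psi(lambda) = 1 / lambda

score_mean = score.mean()                                  # should be near 0
cov_vt = np.mean((score - score_mean) * (t - t.mean()))    # near psi'(lambda)

psi_prime = -1.0 / rate**2        # derivative of 1 / lambda
fisher_info = 1.0 / rate**2       # variance of the score
crb = psi_prime**2 / fisher_info  # bound on var(T)
var_t = t.var()

print(score_mean, cov_vt, psi_prime)
print(var_t, crb)
```

In this particular example the simulated variance of $T$ should sit close to the bound itself, since $x$ attains the bound as an estimator of $1/\lambda$.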

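For the case of a d-variate normal distribution

$$\boldsymbol{x} \sim \mathcal{N}_d\left(\boldsymbol{\mu}(\boldsymbol{\theta}), \boldsymbol{C}(\boldsymbol{\theta})\right),$$

the Fisher information matrix has elements[18]

$$I_{m,k} = \frac{\partial \boldsymbol{\mu}^{T}}{\partial \theta_m} \boldsymbol{C}^{-1} \frac{\partial \boldsymbol{\mu}}{\partial \theta_k} + \frac{1}{2} \operatorname{tr}\left(\boldsymbol{C}^{-1} \frac{\partial \boldsymbol{C}}{\partial \theta_m} \boldsymbol{C}^{-1} \frac{\partial \boldsymbol{C}}{\partial \theta_k}\right),$$

where "tr" is the trace.

For example, let $w[j]$ be a sample of $n$ independent observations with unknown mean $\theta$ and known variance $\sigma^2$. Then the Fisher information is a scalar given by

$$I(\theta) = \left(\frac{\partial \boldsymbol{\mu}(\theta)}{\partial \theta}\right)^{T} \boldsymbol{C}^{-1} \left(\frac{\partial \boldsymbol{\mu}(\theta)}{\partial \theta}\right) = \sum_{i=1}^{n} \frac{1}{\sigma^{2}} = \frac{n}{\sigma^{2}},$$

and so the Cramér–Rao bound is

$$\operatorname{var}(\hat{\theta}) \geq \frac{\sigma^{2}}{n}.$$

Suppose X is a normally distributed random variable with known mean $\mu$ and unknown variance $\sigma^2$. Consider the statistic $T = \frac{1}{n}\sum_{i=1}^{n} (X_i - \mu)^2$, which is unbiased for $\sigma^2$ and has variance $\operatorname{var}(T) = 2(\sigma^2)^2 / n$.

The score is $V = \frac{\partial}{\partial \sigma^2} \log L(\sigma^2, X)$, where $L$ is the likelihood function; for a single observation, $V = -\frac{1}{2\sigma^2} + \frac{(X - \mu)^2}{2(\sigma^2)^2}$.

Thus, the information in a single observation is just minus the expectation of the derivative of $V$, or

$$I = -\operatorname{E}\left(\frac{\partial V}{\partial \sigma^{2}}\right) = -\operatorname{E}\left(-\frac{(X - \mu)^{2}}{(\sigma^{2})^{3}} + \frac{1}{2(\sigma^{2})^{2}}\right) = \frac{\sigma^{2}}{(\sigma^{2})^{3}} - \frac{1}{2(\sigma^{2})^{2}} = \frac{1}{2(\sigma^{2})^{2}}.$$

The information in a sample of $n$ independent observations is $n$ times this, or $n / (2(\sigma^2)^2)$.

The Cramér–Rao bound states that

$$\operatorname{var}(T) \geq \frac{1}{nI} = \frac{2(\sigma^{2})^{2}}{n}.$$

In this case, the inequality is saturated (equality is achieved), showing that the estimator is efficient.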
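The saturation of the bound in the known-mean case can be checked by simulation. The Python sketch below draws repeated normal samples, computes the statistic $T = \frac{1}{n}\sum_i (X_i - \mu)^2$ for each, and compares its empirical variance with the bound $2(\sigma^2)^2/n$. The function name, the seed, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_known_mean_estimator(mu, sigma2, n, trials, seed=1):
    """Return the empirical mean and variance of T = mean((X - mu)^2)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))
    t = ((x - mu) ** 2).mean(axis=1)   # one value of T per simulated sample
    return t.mean(), t.var()

# Hypothetical values for the illustration.
mu, sigma2, n, trials = 0.0, 4.0, 20, 40000

mean_t, var_t = simulate_known_mean_estimator(mu, sigma2, n, trials)
crb = 2.0 * sigma2**2 / n              # reciprocal of n / (2 * sigma2^2)

print(f"mean of T: {mean_t:.3f} (true variance {sigma2})")
print(f"var of T:  {var_t:.3f} (Cramér–Rao bound {crb:.3f})")
```

Because the simulation has finite size, the empirical variance may fall slightly above or below the bound; the comparison is only meaningful up to Monte Carlo error.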

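However, we can achieve a lower mean squared error using a biased estimator.

The estimator $T = \frac{1}{n+2}\sum_{i=1}^{n} (X_i - \mu)^2$ has variance $2n(\sigma^2)^2/(n+2)^2$ and bias $-2\sigma^2/(n+2)$, so its mean squared error is $2(\sigma^2)^2/(n+2)$, which is less than the $2(\sigma^2)^2/n$ that unbiased estimators can achieve according to the Cramér–Rao bound.

When the mean is not known, the minimum mean squared error estimate of the variance of a sample from a Gaussian distribution is achieved by dividing by $n+1$, rather than $n-1$ or $n+2$.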
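The choice of divisor in the unknown-mean case can be verified exactly rather than by simulation. For $n$ independent Gaussian observations, the sum of squared deviations from the sample mean is distributed as $\sigma^2$ times a chi-squared variable with $n-1$ degrees of freedom, so dividing it by $d$ gives an estimator with mean $(n-1)\sigma^2/d$ and variance $2(n-1)\sigma^4/d^2$. The Python sketch below evaluates the resulting MSE with exact rational arithmetic; the function name and the sample size are assumptions for illustration.

```python
from fractions import Fraction

def mse_over_sigma4(n, d):
    """MSE / sigma^4 of S / d, where S = sum((x_i - sample_mean)^2).

    Uses S ~ sigma^2 * chi-squared with n - 1 degrees of freedom.
    """
    k = n - 1
    variance = Fraction(2 * k, d * d)   # var(S / d) / sigma^4
    bias = Fraction(k, d) - 1           # (E[S / d] - sigma^2) / sigma^2
    return variance + bias ** 2

n = 10  # hypothetical sample size
mse_by_divisor = {d: mse_over_sigma4(n, d) for d in (n - 1, n, n + 1, n + 2)}
best_divisor = min(mse_by_divisor, key=mse_by_divisor.get)

for d, mse in mse_by_divisor.items():
    print(f"divisor {d}: MSE / sigma^4 = {mse} ~ {float(mse):.4f}")
print("divisor with smallest MSE:", best_divisor)
```

For this sample size the smallest value is attained at the divisor $n+1$, where the MSE equals $2\sigma^4/(n+1)$; the unbiased divisor $n-1$ gives the larger value $2\sigma^4/(n-1)$.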

Illustration of the Cramér–Rao bound: there is no unbiased estimator that can estimate the (2-dimensional) parameter with less variance than the Cramér–Rao bound, illustrated as a standard deviation ellipse.