Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.
It also provides an example where imposing the requirement for unbiased estimation might be seen as just adding inconvenience, with no real benefit.
One way of seeing that the sample standard deviation is a biased estimator of the standard deviation of the population is to start from the result that the sample variance

s² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − x̄)²

is an unbiased estimator for the variance σ² of the underlying population if that variance exists and the sample values are drawn independently with replacement.
The use of n − 1 instead of n in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the population variance, and some, but not all, of the bias in the estimation of the population standard deviation.
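The unbiasedness of the n − 1 divisor can be checked exhaustively on a tiny population; this sketch (population and sample size chosen arbitrarily for illustration) enumerates every size-2 sample drawn with replacement and confirms that the average of s² equals σ²:

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3]            # arbitrary toy population
sigma2 = pvariance(population)    # population variance (divisor n)

# All size-2 samples drawn with replacement (9 equally likely samples).
samples = list(product(population, repeat=2))

# Average the Bessel-corrected sample variance (divisor n - 1) over them.
avg_s2 = mean(variance(s) for s in samples)

print(sigma2, avg_s2)  # both equal 2/3: s^2 is unbiased for sigma^2
```

Repeating the same enumeration with divisor n instead of n − 1 would give an average below σ², which is the bias that Bessel's correction removes.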
When the random variable is normally distributed, a minor correction exists to eliminate the bias. To derive the correction, note that for normally distributed X, Cochran's theorem implies that (n − 1)s²/σ² has a chi-squared distribution with n − 1 degrees of freedom, and consequently that √(n − 1) s/σ has a chi distribution with n − 1 degrees of freedom. Calculating the expectation of this last expression and rearranging constants,

E[s] = c₄(n) σ,

where the correction factor

c₄(n) = √(2/(n − 1)) · Γ(n/2) / Γ((n − 1)/2)

is the mean of the chi distribution with n − 1 degrees of freedom divided by √(n − 1). An unbiased estimator of σ is therefore s/c₄(n). As n grows large c₄(n) approaches 1, and even for smaller values the correction is minor; more complete tables may be found in most textbooks[2][3] on statistical quality control.
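The correction factor is straightforward to evaluate numerically; a minimal sketch in Python, using the log-gamma function to avoid overflow for large n (the function name c4 is mine):

```python
from math import exp, lgamma, sqrt

def c4(n: int) -> float:
    """Bias-correction factor: E[s] = c4(n) * sigma for normal samples."""
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))

# c4 approaches unity as n grows; dividing s by c4(n) removes the bias.
print(c4(2), c4(10), c4(1000))
```

Working with lgamma rather than gamma matters only for large n, where Γ(n/2) itself would overflow a double even though the ratio stays near 1.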
It is important to keep in mind that this correction only produces an unbiased estimator for normally and independently distributed X.
If calculation of the function c₄(n) appears too difficult, there is a simple rule of thumb[6] to take the estimator

σ̂ = √( (1/(n − 1.5)) Σ_{i=1}^{n} (x_i − x̄)² ).

The formula differs from the familiar expression for s² only by having n − 1.5 instead of n − 1 in the denominator.
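The accuracy of this rule of thumb can be checked without simulation: for normal data the n − 1.5 divisor simply rescales s by √((n − 1)/(n − 1.5)), so the estimator's expected value is c₄(n)·√((n − 1)/(n − 1.5))·σ. A sketch of that check (my own verification, not from the source):

```python
from math import exp, lgamma, sqrt

for n in (5, 10, 30):
    # Expected value of the rule-of-thumb estimator divided by sigma:
    # c4(n) from the normal-theory correction, times the rescaling
    # sqrt((n - 1)/(n - 1.5)) introduced by the n - 1.5 divisor.
    c4 = sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2) - lgamma((n - 1) / 2))
    ratio = c4 * sqrt((n - 1) / (n - 1.5))
    print(n, round(ratio, 5))
```

The printed ratios sit within about half a percent of unity even at n = 5, which is why the rule of thumb is serviceable when c₄(n) itself is inconvenient.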
In cases where statistically independent data are modelled by a parametric family of distributions other than the normal distribution, the population standard deviation will, if it exists, be a function of the parameters of the model.
Alternatively, it may be possible to use the Rao–Blackwell theorem as a route to finding a good estimate of the standard deviation.
If the requirement is simply to reduce the bias of an estimated standard deviation, rather than to eliminate it entirely, then two practical approaches are available, both within the context of resampling: the jackknife and the bootstrap.
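As one resampling route, a jackknife bias reduction of the sample standard deviation can be sketched as follows (a generic jackknife, with data values chosen arbitrarily; the source does not prescribe this exact procedure):

```python
from statistics import stdev

def jackknife_stdev(x):
    """Jackknife bias-reduced estimate of the standard deviation."""
    n = len(x)
    s = stdev(x)  # ordinary sample standard deviation (divisor n - 1)
    # Leave-one-out estimates and their average.
    loo = [stdev(x[:i] + x[i + 1:]) for i in range(n)]
    s_bar = sum(loo) / n
    # Jackknife estimate: n*s - (n - 1)*mean(leave-one-out estimates).
    return n * s - (n - 1) * s_bar

data = [2.1, 1.9, 2.4, 2.0, 2.6, 1.8, 2.2, 2.3]
print(stdev(data), jackknife_stdev(data))
```

Because the sample standard deviation is biased low, the jackknife estimate typically comes out slightly larger than the raw stdev; it reduces the bias but does not eliminate it.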
For non-normal distributions an approximate (up to O(n⁻¹) terms) formula for the unbiased estimator of the standard deviation is

σ̂ = √( (1/(n − 1.5 − ¼γ₂)) Σ_{i=1}^{n} (x_i − x̄)² ),

where γ₂ denotes the population excess kurtosis.
However, real-world data often do not meet this requirement; they are autocorrelated (a property also known as serial correlation).
Estimates of the variance, and standard deviation, of autocorrelated data will be biased. The expected value of the sample variance is

E[s²] = σ² [ 1 − (2/(n − 1)) Σ_{k=1}^{n−1} (1 − k/n) ρ_k ],

where n is the sample size, σ² is the population variance, and ρ_k is the autocorrelation function (ACF) of the data. (Note that the expression in the brackets is simply one minus the average expected autocorrelation for the readings.)
If the ACF consists of positive values then the estimate of the variance (and its square root, the standard deviation) will be biased low.
That is, the actual variability of the data will be greater than that indicated by an uncorrected variance or standard deviation calculation.
It is essential to recognize that, if this expression is to be used to correct for the bias, by dividing the estimate s² by the bracketed quantity above, then the ACF must be known analytically, not via estimation from the data, since the estimated ACF will itself be biased.
To illustrate the magnitude of the bias in the standard deviation, consider a dataset that consists of sequential readings from an instrument that uses a specific digital filter whose ACF is known to be given by[8]

ρ_k = (1 − α)^k,

where α is the parameter of the filter, taking values between zero and unity, so that the ACF is positive and geometrically decreasing. The figure shows the ratio of the estimated standard deviation to its known value (which can be calculated analytically for this digital filter), for several settings of α as a function of sample size n. Changing α alters the variance reduction ratio of the filter, which is known to be

VRR = α / (2 − α),

so that smaller values of α result in more variance reduction, or “smoothing.” The bias is indicated by values on the vertical axis different from unity; that is, if there were no bias, the ratio of the estimated to known standard deviation would be unity.
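The bias factor the figure depicts can be computed directly from the ACF; a sketch that evaluates the bracketed correction term for the geometric ACF ρ_k = (1 − α)^k (variable names are mine):

```python
def gamma1(n: int, alpha: float) -> float:
    """Expected value of s^2 / sigma^2 for the ACF rho_k = (1 - alpha)**k."""
    acf_sum = sum((1 - k / n) * (1 - alpha) ** k for k in range(1, n))
    return 1.0 - 2.0 / (n - 1) * acf_sum

for alpha in (0.9, 0.5, 0.1):
    # sqrt(gamma1) is the expected ratio of estimated to true std. dev.;
    # smaller alpha (more smoothing) drives the ratio further below one.
    print(alpha, round(gamma1(10, alpha) ** 0.5, 4))
```

For n = 10, the expected ratio drops from roughly 0.99 at α = 0.9 to well below 0.6 at α = 0.1, consistent with the low bias shown in the figure for heavily smoothed data.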
First define the following constants, assuming, again, a known ACF:

γ₁ ≡ 1 − (2/(n − 1)) Σ_{k=1}^{n−1} (1 − k/n) ρ_k

γ₂ ≡ 1 + 2 Σ_{k=1}^{n−1} (1 − k/n) ρ_k

so that

E[s²] = σ² γ₁  ⇒  E[s²/γ₁] = σ².

This says that the expected value of the quantity obtained by dividing the observed sample variance by the correction factor γ₁ gives an unbiased estimate of the variance. Similarly, the variance of the mean can be written

E[(x̄ − μ)²] = (σ²/n) γ₂,

and substituting the estimate for σ² gives

E[ (s²/γ₁)(γ₂/n) ] = Var(x̄),

which is an unbiased estimator of the variance of the mean in terms of the observed sample variance and known quantities. If the autocorrelations ρ_k are identically zero, this expression reduces to the well-known result for the variance of the mean for independent data.
The effect of the expectation operator in these expressions is that the equality holds in the mean (i.e., on average).
Having the expressions above involving the variance of the population, and of an estimate of the mean of that population, it would seem logical to simply take the square root of these expressions to obtain unbiased estimates of the respective standard deviations. However, since expectations are integrals,

E[s] ≠ √(E[s²]) = σ √γ₁.

Instead, assume a function θ exists such that

E[s] = σ θ √γ₁  ⇒  σ̂ = s / (θ √γ₁),

where θ depends on the sample size n and the ACF. In the case of NID (normally and independently distributed) data, the radicand is unity and θ is just the c₄ function given in the first section above. As with c₄, θ approaches unity as the sample size increases (as does γ₁).
The unbiased variance of the mean in terms of the population variance and the ACF is given by

Var[x̄] = (σ²/n) γ₂,

and since there are no expected values here, in this case the square root can be taken, so that

σ_x̄ = (σ/√n) √γ₂.

Using the unbiased estimate expression above for σ, an estimate of the standard deviation of the mean will then be

σ̂_x̄ = (s / (θ √γ₁)) · (√γ₂ / √n).

If the data are NID, so that the ACF vanishes, this reduces to

σ̂_x̄ = s / (c₄(n) √n).

In the presence of a nonzero ACF, ignoring the function θ as before leads to the reduced-bias estimator

σ̂_x̄ ≈ (s/√γ₁) · (√γ₂/√n) = (s/√n) √(γ₂/γ₁),

which again can be demonstrated to remove a useful majority of the bias.
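Putting the pieces together, a sketch that computes γ₁ and γ₂ from a known ACF and forms the reduced-bias estimates of σ and of the standard deviation of the mean (θ is taken as unity, as in the reduced-bias estimator above; the function name, data values, and example ACF are mine):

```python
from math import sqrt
from statistics import mean

def corrected_estimates(x, rho):
    """Reduced-bias std.-dev. estimates for data with known ACF.

    rho(k) must return the autocorrelation at lag k for k = 1 .. n-1;
    the function theta is approximated by unity (reduced-bias, not
    fully unbiased).
    """
    n = len(x)
    acf_sum = sum((1 - k / n) * rho(k) for k in range(1, n))
    gamma1 = 1 - (2 / (n - 1)) * acf_sum   # E[s^2] = sigma^2 * gamma1
    gamma2 = 1 + 2 * acf_sum               # Var[xbar] = sigma^2/n * gamma2
    xbar = mean(x)
    s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)
    sigma_hat = sqrt(s2 / gamma1)                   # reduced-bias sigma
    sigma_mean_hat = sigma_hat * sqrt(gamma2 / n)   # std. dev. of the mean
    return sigma_hat, sigma_mean_hat

# Example with a hypothetical geometric ACF, rho_k = 0.5**k.
data = [1.2, 1.5, 1.1, 1.7, 1.4, 1.6, 1.3, 1.8, 1.5, 1.9]
print(corrected_estimates(data, lambda k: 0.5 ** k))
```

With a positive ACF, γ₁ < 1, so the corrected σ estimate exceeds the raw sample standard deviation, counteracting the downward bias; with a zero ACF the function reduces to the familiar s and s/√n.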
This article incorporates public domain material from the National Institute of Standards and Technology