In statistics, the rule is often called Chebyshev's theorem, referring to the range of standard deviations around the mean.
The inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined.
[3] The theorem is named after Russian mathematician Pafnuty Chebyshev, although it was first formulated by his friend and colleague Irénée-Jules Bienaymé.
[8] Chebyshev's inequality is usually stated for random variables, but can be generalized to a statement about measure spaces.
Let (Ω, Σ, μ) be a measure space, and let f be an extended real-valued measurable function defined on Ω. Then for any real number t > 0 and 0 < p < ∞,
\[ \mu(\{\omega \in \Omega : |f(\omega)| \geq t\}) \leq \frac{1}{t^p} \int_{\{|f| \geq t\}} |f|^p \, d\mu. \]
More generally, if g is an extended real-valued measurable function, nonnegative and nondecreasing, with g(t) ≠ 0, then
\[ \mu(\{\omega \in \Omega : f(\omega) \geq t\}) \leq \frac{1}{g(t)} \int_{\Omega} g \circ f \, d\mu. \]
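For instance, taking μ to be a probability measure, f = X − E[X], p = 2, and t = kσ recovers the usual probabilistic form:
\[ \Pr\left( |X - \operatorname{E}[X]| \geq k\sigma \right) \leq \frac{1}{k^2\sigma^2} \operatorname{E}\left[ (X - \operatorname{E}[X])^2 \right] = \frac{1}{k^2}. \]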
For example, if the word count has a mean of 1000 and a standard deviation of 200, Chebyshev's inequality guarantees that with probability at least 75% the count lies within two standard deviations of the mean, that is, between 600 and 1400. But if we additionally know that the distribution is normal, we can say there is a 75% chance the word count is between 770 and 1230 (which is an even tighter bound).
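A minimal numerical sketch of this comparison (assuming the mean of 1000 and standard deviation of 200 used above, and using SciPy for the normal quantile):

    from scipy.stats import norm

    mu, sigma = 1000, 200        # assumed mean and standard deviation of the word count
    k = 2                        # two standard deviations

    # Chebyshev: at least 1 - 1/k**2 = 75% of the mass lies within k standard deviations.
    cheb_interval = (mu - k * sigma, mu + k * sigma)      # (600, 1400)

    # Under a normal assumption the central 75% interval is much narrower.
    z = norm.ppf(0.5 + 0.75 / 2)                          # about 1.15
    normal_interval = (mu - z * sigma, mu + z * sigma)    # about (770, 1230)

    print(cheb_interval, normal_interval)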
Markov's inequality states that for any real-valued random variable Y and any positive number a, we have
\[ \Pr(|Y| \geq a) \leq \frac{\operatorname{E}(|Y|)}{a}. \]
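Chebyshev's inequality follows by applying Markov's inequality to the nonnegative variable Y = (X − μ)² with a = k²σ²:
\[ \Pr(|X - \mu| \geq k\sigma) = \Pr\left( (X - \mu)^2 \geq k^2\sigma^2 \right) \leq \frac{\operatorname{E}\left[ (X - \mu)^2 \right]}{k^2\sigma^2} = \frac{1}{k^2}. \]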
Chebyshev's inequality can also be obtained directly from a simple comparison of areas, starting from the representation of an expected value as the difference of two improper Riemann integrals (last formula in the definition of expected value for arbitrary real-valued random variables).
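For the nonnegative variable (X − μ)² the second of these integrals vanishes, and a sketch of one way to make the comparison of areas precise is to note that the expectation (the area under the survival function) dominates the rectangle of width k²σ² and height Pr((X − μ)² ≥ k²σ²):
\[ \sigma^2 = \int_0^{\infty} \Pr\left( (X - \mu)^2 > u \right) du \;\geq\; \int_0^{k^2\sigma^2} \Pr\left( (X - \mu)^2 \geq k^2\sigma^2 \right) du \;=\; k^2\sigma^2 \, \Pr(|X - \mu| \geq k\sigma), \]
which rearranges to Chebyshev's bound.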
[13] Chebyshev's inequality naturally extends to the multivariate setting, where one has n random variables Xi with mean μi and variance σi². Then the following inequality holds:
\[ \Pr\left( \sum_{i=1}^{n} (X_i - \mu_i)^2 \geq k^2 \sum_{i=1}^{n} \sigma_i^2 \right) \leq \frac{1}{k^2}. \]
[14] This result can be rewritten in terms of vectors X = (X1, X2, ...) with mean μ = (μ1, μ2, ...), standard deviation σ = (σ1, σ2, ...), in the Euclidean norm || ⋅ ||:
\[ \Pr\left( \| X - \mu \| \geq k \| \sigma \| \right) \leq \frac{1}{k^2}. \]
The inequality can be written in terms of the Mahalanobis distance as
\[ \Pr\left( d_S^2(X, \mu) \geq t^2 \right) \leq \frac{n}{t^2}, \]
where the Mahalanobis distance based on S is defined by
\[ d_S(x, \mu) = \sqrt{(x - \mu)^{\mathrm{T}} S^{-1} (x - \mu)}. \]
Navarro[17] proved that these bounds are sharp, that is, they are the best possible bounds for those regions when only the mean and the covariance matrix of X are known. Stellato et al.[18] showed that this multivariate version of the Chebyshev inequality can be easily derived analytically as a special case of Vandenberghe et al.,[19] where the bound is computed by solving a semidefinite program (SDP).
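A short check of where the n/t² bound comes from (assuming S is nonsingular): the expected squared Mahalanobis distance equals the dimension, so Markov's inequality applies directly:
\[ \operatorname{E}\left[ d_S^2(X, \mu) \right] = \operatorname{E}\left[ (X - \mu)^{\mathrm{T}} S^{-1} (X - \mu) \right] = \operatorname{tr}\left( S^{-1} \operatorname{E}\left[ (X - \mu)(X - \mu)^{\mathrm{T}} \right] \right) = \operatorname{tr}(I_n) = n, \]
and therefore \Pr( d_S^2(X, \mu) \geq t^2 ) \leq n / t^2.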
[25] Mitzenmacher and Upfal[26] note that by applying Markov's inequality to the nonnegative variable |X − E(X)|^n, one can obtain a family of tail bounds
\[ \Pr\left( |X - \operatorname{E}(X)| \geq k\sigma \right) \leq \frac{\operatorname{E}\left( |X - \operatorname{E}(X)|^{n} \right)}{k^{n} \sigma^{n}}, \qquad n \geq 2; \]
taking n = 2 recovers Chebyshev's inequality.
For k ≥ 1, n > 4 and assuming that the nth moment exists, this bound is tighter than Chebyshev's inequality.
[citation needed] This strategy, called the method of moments, is often used to prove tail bounds.
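As a simple illustration, for a standard normal variable Z (for which E(Z⁴) = 3), taking n = 4 and k = 3 gives
\[ \Pr(|Z| \geq 3) \leq \frac{\operatorname{E}(Z^4)}{3^4} = \frac{3}{81} \approx 0.037, \]
noticeably tighter than the bound of 1/9 ≈ 0.111 given by Chebyshev's inequality.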
A table of values for the Saw–Yang–Mo inequality for finite sample sizes (N < 100) has been determined by Konijn.
For example, Konijn shows that for N = 59, the 95 percent confidence interval for the mean m is (m − Cs, m + Cs) where C = 4.447 × 1.006 = 4.47 (this is about 2.28 times larger than the value of 1.96 found under the assumption of normality, showing the loss of precision that results from not knowing the precise nature of the distribution).
Chebyshev's inequality states that no more than about 11.11% of the distribution can lie three or more standard deviations away from the mean.
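Explicitly, setting k = 3 in the bound gives
\[ \Pr(|X - \mu| \geq 3\sigma) \leq \frac{1}{3^2} = \frac{1}{9} \approx 11.11\%. \]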
Although Chebyshev's inequality is the best possible bound for an arbitrary distribution, this is not necessarily true for finite samples.
By comparison, Chebyshev's inequality states that all but a 1/N fraction of the sample will lie within √N standard deviations of the mean.
However, the benefit of Chebyshev's inequality is that it can be applied more generally to get confidence bounds for ranges of standard deviations that do not depend on the number of samples.
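The √N figure quoted above comes from setting k = √N in the 1/k² bound:
\[ \Pr\left( |X - \mu| \geq \sqrt{N}\,\sigma \right) \leq \frac{1}{N}. \]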
An alternative method of obtaining sharper bounds is through the use of semivariances (partial variances).
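A sketch of how such a bound can be obtained, using population semivariances about the mean (sample analogues use sums over the observations below or above the sample mean): define
\[ \sigma_-^2 = \operatorname{E}\left[ (\mu - X)_+^2 \right], \qquad \sigma_+^2 = \operatorname{E}\left[ (X - \mu)_+^2 \right], \qquad \sigma^2 = \sigma_-^2 + \sigma_+^2 . \]
Markov's inequality applied to (μ − X)₊² and (X − μ)₊² then gives, for a > 0,
\[ \Pr(X \leq \mu - a\sigma_-) \leq \frac{1}{a^2}, \qquad \Pr(X \geq \mu + a\sigma_+) \leq \frac{1}{a^2}. \]
Because σ₋ and σ₊ are each no larger than σ, these one-sided statements constrain the tails at least as tightly as the same bound phrased in terms of the full standard deviation.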
This inequality corresponds to the one from Saw et al.[30] Moreover, the right-hand side can be simplified by upper-bounding the floor function by its argument.
As a result of its generality it may not (and usually does not) provide as sharp a bound as alternative methods that can be used if the distribution of the random variable is known.
To improve the sharpness of the bounds provided by Chebyshev's inequality, a number of methods have been developed; several of these are discussed below.
To express this in symbols, let μ, ν, and σ be, respectively, the mean, the median, and the standard deviation; the claim is that |μ − ν| ≤ σ.
Setting k = 1 in the statement for the one-sided inequality gives:
\[ \Pr(X - \mu \geq \sigma) \leq \frac{1}{1 + 1^2} = \frac{1}{2}, \quad \text{i.e.} \quad \Pr(X \geq \mu + \sigma) \leq \frac{1}{2}. \]
Changing the sign of X and of μ, we get
\[ \Pr(X \leq \mu - \sigma) \leq \frac{1}{2}. \]
As the median is by definition any real number m that satisfies the inequalities
\[ \Pr(X \leq m) \geq \frac{1}{2} \quad \text{and} \quad \Pr(X \geq m) \geq \frac{1}{2}, \]
this implies that the median lies within one standard deviation of the mean.
In 1823 Gauss showed that for a distribution with a unique mode at zero,[41]
\[ \Pr(|X| > k) \leq \frac{4\operatorname{E}(X^2)}{9k^2} \quad \text{if } k^2 \geq \tfrac{4}{3}\operatorname{E}(X^2), \]
\[ \Pr(|X| > k) \leq 1 - \frac{k}{\sqrt{3\operatorname{E}(X^2)}} \quad \text{if } k^2 \leq \tfrac{4}{3}\operatorname{E}(X^2). \]
The Vysochanskij–Petunin inequality generalizes Gauss's inequality, which only holds for deviation from the mode of a unimodal distribution, to deviation from the mean, or more generally, any center.
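In its most common form, the Vysochanskij–Petunin inequality states that for a unimodal random variable X with mean μ and finite variance σ²,
\[ \Pr(|X - \mu| \geq \lambda\sigma) \leq \frac{4}{9\lambda^2} \qquad \text{for } \lambda \geq \sqrt{8/3} \approx 1.63. \]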
Further, for symmetrical distributions, one-sided bounds can be obtained by noticing that
\[ \Pr(X - \mu \geq k\sigma) = \tfrac{1}{2} \Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{2k^2}. \]
The additional factor of 1/2 present in these tail bounds leads to better confidence intervals than Chebyshev's inequality.
[46] Applying it to the square of a random variable yields a corresponding tail bound for the squared variable. One use of Chebyshev's inequality in applications is to create confidence intervals for variates with an unknown distribution.
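A minimal sketch of such a distribution-free interval (the exponential data and the assumption that the mean and standard deviation are known are illustrative choices, not from the source):

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative data: an exponential distribution (mean = std = 1), far from normal.
    mu, sigma = 1.0, 1.0
    sample = rng.exponential(scale=1.0, size=100_000)

    alpha = 0.05                 # allowed miss probability
    k = 1 / np.sqrt(alpha)       # Chebyshev: Pr(|X - mu| >= k*sigma) <= 1/k**2 = alpha
    lo, hi = mu - k * sigma, mu + k * sigma

    # The empirical coverage must be at least 1 - alpha, whatever the distribution.
    coverage = np.mean((sample >= lo) & (sample <= hi))
    print(f"interval [{lo:.2f}, {hi:.2f}], empirical coverage {coverage:.4f}")

Under a normality assumption the same 95% coverage would require only k ≈ 1.96 instead of k ≈ 4.47, which illustrates the price paid for making no distributional assumption.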