Previous versions of the theorem date back to 1811, but in its modern form it was only precisely stated as late as 1920.
The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations of the sample average around the deterministic mean μ.
The multidimensional central limit theorem states that sums of independent, identically distributed random vectors, when appropriately centered and scaled, converge to a multivariate normal distribution.[11]
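In symbols, and stated minimally (standard notation is assumed here, since this section does not reproduce the precise statements): for i.i.d. samples with finite mean and variance,

```latex
% Classical CLT: X_1, X_2, ... i.i.d. with mean \mu and variance \sigma^2 < \infty
\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \;\xrightarrow{\;d\;}\; \mathcal{N}\bigl(0,\,\sigma^2\bigr),
\qquad \bar{X}_n = \tfrac{1}{n}\textstyle\sum_{i=1}^{n} X_i .

% Multidimensional CLT: i.i.d. random vectors with mean vector \boldsymbol{\mu}
% and covariance matrix \boldsymbol{\Sigma}
\sqrt{n}\,\bigl(\bar{\mathbf{X}}_n - \boldsymbol{\mu}\bigr) \;\xrightarrow{\;d\;}\; \mathcal{N}_k\bigl(\mathbf{0},\,\boldsymbol{\Sigma}\bigr).
```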
The Generalized Central Limit Theorem (GCLT) was the combined effort of multiple mathematicians (Bernstein, Lindeberg, Lévy, Feller, Kolmogorov, and others) over the period from 1920 to 1937.
[13] An English language version of the complete proof of the GCLT is available in the translation of Gnedenko and Kolmogorov's 1954 book.
Stein's method[21] can be used not only to prove the central limit theorem, but also to provide bounds on the rates of convergence for selected metrics.[23]
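A representative quantitative bound of this kind is the Berry–Esseen inequality, stated here as a standard illustration of a convergence-rate bound rather than as the specific result of [21]: for i.i.d. Xi with mean μ, variance σ², and third absolute moment ρ = E|X1 − μ|³ < ∞,

```latex
\sup_{x \in \mathbb{R}} \bigl| F_n(x) - \Phi(x) \bigr| \;\le\; \frac{C\,\rho}{\sigma^{3}\sqrt{n}},
```

where Fn is the distribution function of the standardized sum, Φ is the standard normal distribution function, and C is an absolute constant.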
The central limit theorem applies in particular to sums of independent and identically distributed discrete random variables.
This means that if we build a histogram of the realizations of the sum of n independent identical discrete variables, the piecewise-linear curve that joins the centers of the upper faces of the rectangles forming the histogram converges toward a Gaussian curve as n approaches infinity; this relation is known as the de Moivre–Laplace theorem.
The binomial distribution article details such an application of the central limit theorem in the simple case of a discrete variable taking only two possible values.
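A minimal simulation of this convergence, assuming NumPy and Matplotlib are available (the script and its parameter choices are illustrative, not drawn from the cited article):

```python
import numpy as np
import matplotlib.pyplot as plt

# de Moivre-Laplace: a Binomial(n, p) histogram approaches a Gaussian as n grows.
n, p, trials = 200, 0.3, 100_000
rng = np.random.default_rng(0)
sums = rng.binomial(n, p, size=trials)  # each draw is a sum of n Bernoulli(p) variables

# Histogram of the simulated sums, one bin per integer value.
bins = np.arange(sums.min() - 0.5, sums.max() + 1.5)
plt.hist(sums, bins=bins, density=True, alpha=0.5, label="Binomial(n, p) histogram")

# Limiting Gaussian curve with matching mean np and variance np(1-p).
mu, sigma = n * p, np.sqrt(n * p * (1 - p))
x = np.linspace(sums.min(), sums.max(), 400)
plt.plot(x, np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi)),
         label="Normal approximation")
plt.legend()
plt.show()
```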
Studies have shown that the central limit theorem is subject to several common but serious misconceptions, some of which appear in widely used textbooks.
Informally, something along these lines happens when the sum, Sn, of independent identically distributed random variables, X1, ..., Xn, is studied in classical probability theory.
In the case where the Xi do not have finite mean or variance, convergence of the shifted and rescaled sum can also occur with different centering and scaling factors, and the limit is then a stable rather than a normal distribution.
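Sketched in standard notation (the centering sequence an, scaling sequence bn, and limit Ξ are the usual ones, assumed here rather than quoted from the original):

```latex
\frac{S_n - a_n}{b_n} \;\xrightarrow{\;d\;}\; \Xi,
```

where Ξ has a stable distribution; the normal law arises as the special case with stability index α = 2.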
See Petrov[32] for a particular local limit theorem for sums of independent and identically distributed random variables.
Many physical quantities (especially mass or length, which are a matter of scale and cannot be negative) are the products of different random factors, so they follow a log-normal distribution.
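The underlying reasoning is that taking logarithms turns a product of many independent positive factors into a sum, to which the central limit theorem applies; a standard sketch:

```latex
\log\Bigl(\prod_{i=1}^{n} X_i\Bigr) \;=\; \sum_{i=1}^{n} \log X_i \;\approx\; \mathcal{N}(\mu, \sigma^2)
\quad\Longrightarrow\quad \prod_{i=1}^{n} X_i \ \text{is approximately log-normal.}
```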
Theorem (Salem–Zygmund) — Let U be a random variable distributed uniformly on (0, 2π), and Xk = rk cos(nkU + ak), where the frequencies nk satisfy a lacunarity condition (there exists q > 1 such that nk+1 ≥ q·nk for all k) and the amplitudes rk are such that r1² + ⋯ + rk² → ∞ while rk/√(r1² + ⋯ + rk²) → 0. Then (X1 + ⋯ + Xk)/√(r1² + ⋯ + rk²) converges in distribution to N(0, 1/2).[39][40]
Since real-world quantities are often the balanced sum of many unobserved random events, the central limit theorem also provides a partial explanation for the prevalence of the normal probability distribution.
Given its importance to statistics, a number of papers and computer packages are available that demonstrate the convergence involved in the central limit theorem.[47]
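A minimal sketch of such a demonstration, assuming NumPy and Matplotlib (the exponential source distribution and the sample sizes are arbitrary illustrative choices):

```python
import numpy as np
import matplotlib.pyplot as plt

# Standardized means of n exponential draws approach N(0, 1) as n grows.
rng = np.random.default_rng(42)
trials = 50_000

for n in (1, 5, 50):
    samples = rng.exponential(scale=1.0, size=(trials, n))
    # Exponential(1) has mean 1 and variance 1, so standardize accordingly.
    z = (samples.mean(axis=1) - 1.0) * np.sqrt(n)
    plt.hist(z, bins=100, density=True, histtype="step", label=f"n = {n}")

# Standard normal density for comparison.
x = np.linspace(-4, 4, 400)
plt.plot(x, np.exp(-x**2 / 2) / np.sqrt(2 * np.pi), "k--", label="N(0, 1)")
plt.legend()
plt.show()
```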
Dutch mathematician Henk Tijms writes:[48] The central limit theorem has an interesting history.
This finding was far ahead of its time, and was nearly forgotten until the famous French mathematician Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie analytique des probabilités, which was published in 1812.
It was not until the nineteenth century was at an end that the importance of the central limit theorem was discerned, when, in 1901, Russian mathematician Aleksandr Lyapunov defined it in general terms and proved precisely how it worked mathematically.
Nowadays, the central limit theorem is considered to be the unofficial sovereign of probability theory.
Sir Francis Galton described the Central Limit Theorem in this way:[49] I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the "Law of Frequency of Error".
Whenever a large sample of chaotic elements are taken in hand and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along.
The actual term "central limit theorem" (in German: "zentraler Grenzwertsatz") was first used by George Pólya in 1920 in the title of a paper.
According to Le Cam, the French school of probability interprets the word central in the sense that "it describes the behaviour of the centre of the distribution as opposed to its tails".[51]
The abstract of the paper On the central limit theorem of calculus of probability and the problem of moments by Pólya[50] in 1920 translates as follows.
The occurrence of the Gaussian probability density e−x² in repeated experiments, in errors of measurements, which result in the combination of very many and very small elementary errors, in diffusion processes etc., can be explained, as is well-known, by the very same limit theorem, which plays a central role in the calculus of probability.
The actual discoverer of this limit theorem is to be named Laplace; it is likely that its rigorous proof was first given by Tschebyscheff and its sharpest formulation can be found, as far as I am aware of, in an article by Liapounoff.
...A thorough account of the theorem's history, detailing Laplace's foundational work, as well as Cauchy's, Bessel's and Poisson's contributions, is provided by Hald.[52]
Two historical accounts, one covering the development from Laplace to Cauchy, the second the contributions by von Mises, Pólya, Lindeberg, Lévy, and Cramér during the 1920s, are given by Hans Fischer.[51]
Bernstein[54] presents a historical discussion focusing on the work of Pafnuty Chebyshev and his students Andrey Markov and Aleksandr Lyapunov that led to the first proofs of the CLT in a general setting.
A curious footnote to the history of the Central Limit Theorem is that a proof of a result similar to the 1922 Lindeberg CLT was the subject of Alan Turing's 1934 Fellowship Dissertation for King's College at the University of Cambridge.