Stein's method

It was introduced by Charles Stein, who first published it in 1972,[1] to obtain a bound between the distribution of a sum of an m-dependent sequence of random variables and a standard normal distribution in the Kolmogorov (uniform) metric, and hence to prove not only a central limit theorem but also bounds on the rates of convergence for the given metric.

At the end of the 1960s, dissatisfied with the then-known proofs of a specific central limit theorem, Charles Stein developed a new way of proving the theorem for his statistics lecture.[2] His seminal paper was presented in 1970 at the Sixth Berkeley Symposium and published in the corresponding proceedings.[1] Later, his Ph.D. student Louis Chen Hsiao Yun modified the method so as to obtain approximation results for the Poisson distribution;[3] therefore the Stein method applied to the problem of Poisson approximation is often referred to as the Stein–Chen method.

Probably the most important contributions are the monograph by Stein (1986), where he presents his view of the method and the concept of auxiliary randomisation, in particular using exchangeable pairs, and the articles by Barbour (1988) and Götze (1991), who introduced the so-called generator interpretation, which made it possible to easily adapt the method to many other probability distributions.

An important contribution was also an article by Bolthausen (1984) on the so-called combinatorial central limit theorem.

The method gained further popularity in the machine learning community in the mid 2010s, following the development of computable Stein discrepancies and the diverse applications and algorithms based on them.

Stein's method bounds the distance between two probability distributions P and Q with respect to a probability metric of the form

d(P, Q) = sup_{h ∈ H} | ∫ h dP − ∫ h dQ |,    (1.1)

where H is a set of test functions. Important examples are the total variation metric, where we let H consist of all the indicator functions of measurable sets; the Kolmogorov (uniform) metric for probability measures on the real numbers, where we consider all the half-line indicator functions; and the Lipschitz (first-order Wasserstein; Kantorovich) metric, where the underlying space is itself a metric space and we take the set H to be all Lipschitz-continuous functions with Lipschitz constant 1.

In what follows, P is the distribution of interest (for example, that of a sum of dependent random variables) and Q is a fixed, tractable target distribution. Here Q is the standard normal distribution, which serves as a classical example.
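For discrete distributions these metrics can be computed directly from their definitions. A minimal sketch (the two binomial distributions are arbitrary stand-ins chosen for illustration), which also shows that the Kolmogorov distance never exceeds the total variation distance, since half-line indicators form a subset of all indicator functions:

```python
import numpy as np
from scipy.stats import binom

n = 20
k = np.arange(n + 1)
p = binom.pmf(k, n, 0.50)  # probability mass function of P
q = binom.pmf(k, n, 0.55)  # probability mass function of Q

# Total variation: supremum over indicators of measurable sets,
# which for discrete distributions equals half the L1 distance.
tv = 0.5 * np.sum(np.abs(p - q))

# Kolmogorov: supremum over half-line indicators, i.e. the
# supremum distance between the two cumulative distribution functions.
kol = np.max(np.abs(np.cumsum(p) - np.cumsum(q)))

print(tv, kol)
assert kol <= tv
```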

A Stein operator for Q is an operator A, acting on a suitable class of functions f, such that E[(Af)(X)] = 0 whenever X has distribution Q. For the standard normal distribution, Stein's lemma yields such an operator:

E[f′(X) − X f(X)] = 0 for X ~ N(0, 1) and all absolutely continuous f with E|f′(X)| < ∞.    (2.3)

Thus, we can take

(Af)(x) = f′(x) − x f(x).

There are in general infinitely many such operators, and it still remains an open question which one to choose.
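The characterizing identity E[f′(Z) − Z f(Z)] = 0 for Z ~ N(0, 1) is easy to check by Monte Carlo simulation. A sketch with NumPy, using the arbitrary test function f(x) = sin(x) (any sufficiently smooth, bounded f would do):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)  # samples from N(0, 1)

# Test function f(x) = sin(x), with derivative f'(x) = cos(x).
# Stein's lemma: E[f'(Z) - Z f(Z)] = 0 for Z ~ N(0, 1).
stein_mean = np.mean(np.cos(z) - z * np.sin(z))

print(stein_mean)  # close to 0, up to Monte Carlo error
```

For a random variable W that is only approximately normal, the corresponding average need not vanish; bounding E[f′(W) − W f(W)] over a suitable class of functions f is precisely how the method proceeds.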

There are different ways to find Stein operators.

Given a Stein operator A for Q and a test function h, the Stein equation asks for a function f solving

(Af)(x) = h(x) − E[h(Z)],    where Z has distribution Q.    (3.1)

If Q is the standard normal distribution and we use (2.3), then the corresponding Stein equation is

f′(x) − x f(x) = h(x) − E[h(Z)].    (3.3)

If the probability distribution Q has an absolutely continuous (with respect to the Lebesgue measure) density q, then[4] a Stein operator is given by

(Af)(x) = f′(x) + f(x) q′(x)/q(x).

Analytic methods.

Equation (3.3) can be easily solved explicitly:

f(x) = e^{x²/2} ∫_{−∞}^{x} (h(s) − E[h(Z)]) e^{−s²/2} ds.    (4.1)

Generator method. If A is the generator of a Markov process (Z_t)_{t≥0} with stationary distribution Q, then the Stein equation has the solution

f(x) = −∫_{0}^{∞} (E^x[h(Z_t)] − E[h(Z)]) dt,    (4.2)

where E^x denotes expectation with respect to the process started in x.
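The explicit solution f(x) = e^{x²/2} ∫_{−∞}^{x} (h(s) − E[h(Z)]) e^{−s²/2} ds of the normal Stein equation can be sanity-checked numerically. A sketch with SciPy for the test function h(x) = x², where E[h(Z)] = 1 and the integral happens to work out in closed form to f(x) = −x (so that f′(x) − x f(x) = x² − 1 = h(x) − E[h(Z)], as required):

```python
import numpy as np
from scipy.integrate import quad

def h(s):
    return s**2  # test function with E[h(Z)] = 1 for Z ~ N(0, 1)

E_h = 1.0

def f(x):
    # f(x) = exp(x^2/2) * integral_{-inf}^{x} (h(s) - E_h) exp(-s^2/2) ds
    integral, _ = quad(lambda s: (h(s) - E_h) * np.exp(-s**2 / 2), -np.inf, x)
    return np.exp(x**2 / 2) * integral

for x in (-1.5, 0.3, 2.0):
    assert abs(f(x) - (-x)) < 1e-6  # matches the closed form f(x) = -x
print("Stein equation solution verified")
```

For general h the integral has no closed form, but the same quadrature gives f pointwise.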

However, one still has to prove that the solution (4.2) exists for all desired functions h ∈ H.

In the case of (4.1) one can prove, for the supremum norm, that

‖f‖∞ ≤ min(√(π/2) ‖h‖∞, 2 ‖h′‖∞),    ‖f′‖∞ ≤ min(2 ‖h‖∞, 4 ‖h′‖∞),    ‖f″‖∞ ≤ 2 ‖h′‖∞,    (5.1)

where the last bound is of course only applicable if h is differentiable (or at least Lipschitz-continuous, in which case ‖h′‖∞ is interpreted as the Lipschitz constant of h).

If we have bounds in the general form (5.1), we are usually able to treat many probability metrics together.

One can often start with the next step below, if bounds of the form (5.1) are already available (which is the case for many distributions).

We are now in a position to bound the left hand side of (3.1).

As this step heavily depends on the form of the Stein operator, we directly consider the case of the standard normal distribution.

At this point we could directly plug in the random variable W whose distribution we want to approximate and try to find upper bounds; however, it is often fruitful to formulate a more general theorem. Using a Taylor expansion, it is possible to bound E[f′(W) − W f(W)] in terms of moments of the summands of W and the norm ‖f″‖∞. Note that, if we follow this line of argument, we can bound (1.1) only for functions h for which ‖h′‖∞ is bounded, because of the third inequality in (5.1).

Recall that the Lipschitz metric is of the form (1.1) where the functions h are Lipschitz-continuous with Lipschitz constant 1, so that ‖h′‖∞ ≤ 1. Thus, to bound the distance between a sum of random variables with local dependence structure and a standard normal distribution, we only need to know the third moments of the summands.

We can treat the case of sums of independent and identically distributed random variables with Theorem A.

From Theorem A we obtain a bound on the Lipschitz distance between the distribution of such a standardized sum and the standard normal distribution in terms of the third absolute moments of the summands; for identically distributed summands this bound is of order n^{−1/2}. For sums of random variables another approach related to Stein's method is known as the zero bias transform.
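The rate of convergence in the central limit theorem can be illustrated concretely. For standardized sums of Rademacher variables (±1 with probability 1/2 each, an illustrative choice), the Kolmogorov distance to the standard normal distribution can be computed exactly, since the sum is a rescaled binomial. A sketch with SciPy:

```python
import numpy as np
from scipy.stats import binom, norm

def kolmogorov_distance(n):
    """Exact Kolmogorov distance between W = (X_1 + ... + X_n)/sqrt(n),
    with X_i independent Rademacher variables, and N(0, 1)."""
    k = np.arange(n + 1)
    atoms = (2 * k - n) / np.sqrt(n)             # support points of W
    cdf_right = binom.cdf(k, n, 0.5)             # P(W <= atom)
    cdf_left = cdf_right - binom.pmf(k, n, 0.5)  # P(W < atom)
    phi = norm.cdf(atoms)
    # The supremum of |F_W - Phi| is attained at the jump points of F_W.
    return max(np.abs(cdf_right - phi).max(), np.abs(cdf_left - phi).max())

d100, d400 = kolmogorov_distance(100), kolmogorov_distance(400)
print(d100, d400)
```

Quadrupling n from 100 to 400 roughly halves the distance, consistent with the O(n^{−1/2}) rate.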

The following text is advanced, and gives a comprehensive overview of the normal case:

Another advanced book, but having some introductory character, is:

A standard reference is the book by Stein (1986), which contains a lot of interesting material but may be a little hard to understand at first reading.

Despite its age, there are few standard introductory books about Stein's method available. The following recent textbook has a chapter (Chapter 2) devoted to introducing Stein's method:

Although the book is in large part about Poisson approximation, it nevertheless contains a lot of information about the generator approach, in particular in the context of Poisson process approximation.