Anderson–Darling test

The test is most often used in contexts where a family of distributions is being tested, in which case the parameters of that family must be estimated, and this must be taken into account by adjusting either the test statistic or its critical values.

When applied to testing whether a normal distribution adequately describes a set of data, it is one of the most powerful statistical tools for detecting most departures from normality.[1][2]

K-sample Anderson–Darling tests are available for testing whether several collections of observations can be modelled as coming from a single population, where the distribution function does not have to be specified.

In addition to its use as a test of fit for distributions, it can be used in parameter estimation as the basis for a form of minimum distance estimation procedure.

The test is named after Theodore Wilbur Anderson (1918–2016) and Donald A. Darling.[3]

The Anderson–Darling and Cramér–von Mises statistics belong to the class of quadratic EDF statistics (tests based on the empirical distribution function).

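If the hypothesized distribution is $F$, and the empirical (sample) cumulative distribution function is $F_n$, then the quadratic EDF statistics measure the distance between $F$ and $F_n$ by

$$n \int_{-\infty}^{\infty} \bigl(F_n(x) - F(x)\bigr)^2 \, w(x) \, dF(x),$$

where $n$ is the number of elements in the sample and $w(x)$ is a weighting function. When $w(x) = 1$, the statistic is the Cramér–von Mises statistic.

The Anderson–Darling (1954) test[4] is based on the distance

$$A^2 = n \int_{-\infty}^{\infty} \frac{\bigl(F_n(x) - F(x)\bigr)^2}{F(x)\,\bigl(1 - F(x)\bigr)} \, dF(x),$$

which is obtained when the weight function is $w(x) = [F(x)\,(1 - F(x))]^{-1}$. Compared with the Cramér–von Mises distance, the Anderson–Darling distance therefore places more weight on observations in the tails of the distribution.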

The Anderson–Darling test assesses whether a sample comes from a specified distribution.

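The formula for the test statistic $A$ to assess whether data $\{Y_1 < \cdots < Y_n\}$ (note that the data must be put in order) comes from a CDF $F$ is

$$A^2 = -n - S,$$

where

$$S = \sum_{i=1}^{n} \frac{2i - 1}{n} \Bigl[\ln F(Y_i) + \ln\bigl(1 - F(Y_{n+1-i})\bigr)\Bigr].$$

The test statistic can then be compared against the critical values of the theoretical distribution. In this case, no parameters are estimated in relation to the cumulative distribution function $F$.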
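The computation for a fully specified distribution can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the form $A^2 = -n - S$ with $S = \sum_{i=1}^{n} \frac{2i-1}{n}\,[\ln F(Y_i) + \ln(1 - F(Y_{n+1-i}))]$, and the function name `anderson_darling` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist


def anderson_darling(sample, cdf):
    """Return A^2 = -n - S for a fully specified continuous CDF.

    `cdf` must be completely specified in advance (no parameters
    estimated from `sample`), and must return values strictly
    between 0 and 1 at every data point.
    """
    y = sorted(sample)  # the formula assumes ordered data
    n = len(y)
    s = 0.0
    for i in range(1, n + 1):
        # Pair the i-th smallest value with the i-th largest value.
        lower = math.log(cdf(y[i - 1]))
        upper = math.log(1.0 - cdf(y[n - i]))
        s += (2 * i - 1) / n * (lower + upper)
    return -n - s


# Hypothetical data, tested against a fully specified N(0, 1).
data = [-1.2, 0.4, 0.1, -0.3, 1.5, 0.8, -0.7, 0.2]
a2 = anderson_darling(data, NormalDist(0.0, 1.0).cdf)
print(a2)
```

The resulting value would then be compared with critical values for the fully specified case; those are not computed here.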

Essentially the same test statistic can be used in the test of fit of a family of distributions, but then it must be compared against the critical values appropriate to that family of theoretical distributions and dependent also on the method used for parameter estimation.

The statistic $A^2$ has been found to be one of the best empirical distribution function statistics for detecting most departures from normality.

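The computation differs based on what is known about the distribution:[6]

Case 0: The mean $\mu$ and the variance $\sigma^2$ are both known.
Case 1: The variance $\sigma^2$ is known, but the mean $\mu$ is unknown.
Case 2: The mean $\mu$ is known, but the variance $\sigma^2$ is unknown.
Case 3: Both the mean $\mu$ and the variance $\sigma^2$ are unknown.

The $n$ observations, $X_i$, for $i = 1, \ldots, n$, of the variable $X$ must be sorted such that $X_1 \le X_2 \le \cdots \le X_n$, and the notation in the following assumes that the $X_i$ represent the ordered observations. Let

$$\hat{\mu} = \begin{cases} \mu, & \text{if the mean is known,} \\ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, & \text{otherwise,} \end{cases}$$

$$\hat{\sigma}^2 = \begin{cases} \sigma^2, & \text{if the variance is known,} \\ \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2, & \text{if the variance is not known, but the mean is,} \\ \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, & \text{otherwise.} \end{cases}$$

The values $X_i$ are standardized to create new values $Y_i = (X_i - \hat{\mu}) / \hat{\sigma}$. With the standard normal CDF $\Phi$, $A^2$ is calculated using

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \Bigl[\ln \Phi(Y_i) + \ln\bigl(1 - \Phi(Y_{n+1-i})\bigr)\Bigr].$$

An alternative expression in which only a single observation is dealt with at each step of the summation is:

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} \Bigl[(2i - 1) \ln \Phi(Y_i) + \bigl(2(n - i) + 1\bigr) \ln\bigl(1 - \Phi(Y_i)\bigr)\Bigr].$$

A modified statistic can be calculated using

$$A^{*2} = \begin{cases} A^2 \left(1 + \frac{4}{n} - \frac{25}{n^2}\right), & \text{if the variance and the mean are both unknown,} \\ A^2, & \text{otherwise.} \end{cases}$$

If $A^2$ or $A^{*2}$ exceeds a given critical value, then the hypothesis of normality is rejected with some significance level.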

Note 2: The above adjustment formula is taken from Shorack & Wellner (1986, p239).

Care is required in comparisons across different sources as often the specific adjustment formula is not stated.
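The calculation for the case where both mean and variance are unknown can be sketched as follows. This is an illustrative sketch under stated assumptions: the variance estimate uses the $n - 1$ denominator, the modification factor is taken to be $1 + 4/n - 25/n^2$, and the function name `ad_normal_case3` and the data are hypothetical. Both the paired form and the single-observation form of the sum are computed so that their agreement can be checked.

```python
import math
from statistics import NormalDist, mean, stdev


def ad_normal_case3(sample):
    """Return (A2, A2_alt, A2_star) for a normality test with
    mean and variance both estimated from the sample."""
    x = sorted(sample)           # ordered observations X_1 <= ... <= X_n
    n = len(x)
    mu_hat = mean(x)
    sigma_hat = stdev(x)         # sample standard deviation (n - 1 denominator)
    phi = NormalDist().cdf       # standard normal CDF
    y = [(xi - mu_hat) / sigma_hat for xi in x]

    # Paired form: the i-th smallest value is paired with the i-th largest.
    s_paired = sum(
        (2 * i - 1) * (math.log(phi(y[i - 1])) + math.log(1 - phi(y[n - i])))
        for i in range(1, n + 1)
    )
    a2 = -n - s_paired / n

    # Single-observation form: each step uses only Y_i.
    s_single = sum(
        (2 * i - 1) * math.log(phi(y[i - 1]))
        + (2 * (n - i) + 1) * math.log(1 - phi(y[i - 1]))
        for i in range(1, n + 1)
    )
    a2_alt = -n - s_single / n

    # Assumed small-sample modification for the both-unknown case.
    a2_star = a2 * (1 + 4 / n - 25 / n**2)
    return a2, a2_alt, a2_star


# Hypothetical measurements.
data = [4.9, 5.3, 5.1, 4.7, 5.8, 5.0, 5.2, 4.6, 5.5, 5.1]
a2, a2_alt, a2_star = ad_normal_case3(data)
print(a2, a2_alt, a2_star)
```

The two forms are algebraically identical (substitute $j = n + 1 - i$ in the second logarithm term), so any difference between `a2` and `a2_alt` is floating-point rounding.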

Alternatively, for case 3 above (both mean and variance unknown), D'Agostino (1986)[6] in Table 4.7 on p. 123 and on pages 372–373 gives the adjusted statistic

$$A^{*2} = A^2 \left(1 + \frac{0.75}{n} + \frac{2.25}{n^2}\right),$$

and normality is rejected if $A^{*2}$ exceeds 0.631, 0.754, 0.884, 1.047, or 1.159 at the 10%, 5%, 2.5%, 1%, and 0.5% significance levels, respectively; the procedure is valid for sample sizes of at least $n = 8$.
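The decision rule above can be sketched as a small lookup. The critical values are the ones quoted in the text; the adjustment factor $1 + 0.75/n + 2.25/n^2$ is assumed here to be the form of D'Agostino's adjustment, and the function names and the example inputs are hypothetical.

```python
# Critical values quoted in the text, keyed by significance level.
CRITICAL_VALUES = {
    0.10: 0.631,
    0.05: 0.754,
    0.025: 0.884,
    0.01: 1.047,
    0.005: 1.159,
}


def dagostino_adjusted(a2, n):
    """Adjust an unmodified A^2 (mean and variance both estimated)."""
    if n < 8:
        raise ValueError("the procedure is stated to be valid only for n >= 8")
    return a2 * (1 + 0.75 / n + 2.25 / n**2)


def rejected_levels(a2, n):
    """Return the significance levels at which normality is rejected."""
    a2_star = dagostino_adjusted(a2, n)
    return [alpha for alpha, crit in CRITICAL_VALUES.items() if a2_star > crit]


# Hypothetical example: an unadjusted A^2 of 0.80 from 20 observations.
print(dagostino_adjusted(0.80, 20))
print(rejected_levels(0.80, 20))
```

Because this adjustment differs from the one attributed to Shorack & Wellner, the adjusted statistic should only be compared with the critical values that belong to the same adjustment.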

Any other family of distributions can be tested, but the test for each family is implemented using a different modification of the basic test statistic, which is then referred to critical values specific to that family of distributions.

The modifications of the statistic and tables of critical values are given by Stephens (1986)[2] for the exponential, extreme-value, Weibull, gamma, logistic, Cauchy, and von Mises distributions.

Details for the required modifications to the test statistic and for the critical values for the normal distribution and the exponential distribution have been published by Pearson & Hartley (1972, Table 54).

A test for the (two parameter) Weibull distribution can be obtained by making use of the fact that the logarithm of a Weibull variate has a Gumbel distribution.
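The Weibull-to-Gumbel relationship can be verified with a short derivation. The parametrization is an assumption made for illustration: shape $k > 0$ and scale $\lambda > 0$, with the Gumbel distribution taken in its minimum (smallest extreme value) form.

```latex
% Weibull CDF with shape k and scale \lambda (assumed parametrization)
P(X \le x) = 1 - \exp\!\left[-\left(\frac{x}{\lambda}\right)^{k}\right], \qquad x \ge 0.

% Distribution of Y = \ln X
P(Y \le y) = P(X \le e^{y})
           = 1 - \exp\!\left[-\left(\frac{e^{y}}{\lambda}\right)^{k}\right]
           = 1 - \exp\!\left[-\exp\!\bigl(k\,(y - \ln\lambda)\bigr)\right].

% Writing \mu = \ln\lambda and \beta = 1/k gives the Gumbel (minimum) CDF
P(Y \le y) = 1 - \exp\!\left[-\exp\!\left(\frac{y - \mu}{\beta}\right)\right].
```

Under these assumptions, testing the log-transformed data for a Gumbel (minimum) distribution with location $\ln\lambda$ and scale $1/k$ is equivalent to testing the original data for a two-parameter Weibull distribution.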

Fritz Scholz and Michael A. Stephens (1987) discuss a test, based on the Anderson–Darling measure of agreement between distributions, for whether a number of random samples with possibly different sample sizes may have arisen from the same distribution, where this distribution is unspecified.[8]

The R package kSamples and the Python package SciPy implement this rank test for comparing k samples, among several other such rank tests.

For $k$ samples, the statistic can be computed as follows under the assumption that the distribution function