The Wald–Wolfowitz runs test (or simply runs test), named after statisticians Abraham Wald and Jacob Wolfowitz, is a non-parametric statistical test that checks a randomness hypothesis for a two-valued data sequence.
More precisely, it can be used to test the hypothesis that the elements of the sequence are mutually independent.
A run of a sequence is a maximal non-empty segment of the sequence consisting of adjacent equal elements.
For example, the 22-element-long sequence
+ + + + − − − + + + − − + + + + + + − − − −
consists of 6 runs, with lengths 4, 3, 3, 2, 6, and 4.
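The definition of a run can be illustrated with a short sketch that splits a sequence into maximal blocks of equal adjacent elements. The helper name `run_lengths` and the particular arrangement of signs in `example` are assumptions made for illustration; the signs are simply one sequence of 22 elements whose run lengths are 4, 3, 3, 2, 6, and 4.

```python
from itertools import groupby


def run_lengths(sequence):
    """Return the lengths of the maximal runs of equal adjacent elements."""
    # groupby starts a new group every time the element changes,
    # which is exactly the definition of a run.
    return [sum(1 for _ in group) for _, group in groupby(sequence)]


# Illustrative 22-element sequence (ASCII '+' and '-' stand for the two values).
example = "++++---+++--++++++----"
lengths = run_lengths(example)
print(len(example), len(lengths), lengths)  # 22 6 [4, 3, 3, 2, 6, 4]
```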
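The runs test is based on the null hypothesis that each element in the sequence is independently drawn from the same distribution.
Under the null hypothesis, the number of runs in a sequence of N elements[note 1] is a random variable whose conditional distribution given the observation of N+ positive values[note 2] and N− negative values (N = N+ + N−) is approximately normal, with:[1][2]
mean {\displaystyle \mu ={\frac {2N_{+}N_{-}}{N}}+1}
variance {\displaystyle \sigma ^{2}={\frac {2N_{+}N_{-}(2N_{+}N_{-}-N)}{N^{2}(N-1)}}={\frac {(\mu -1)(\mu -2)}{N-1}}}
Equivalently, the number of runs is
{\displaystyle R={\frac {1}{2}}\left(N_{+}+N_{-}+1-\sum _{i=1}^{N-1}x_{i}x_{i+1}\right)}
where {\displaystyle x_{i}} is +1 if the i-th element is positive and −1 if it is negative.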
These parameters do not assume that the positive and negative elements have equal probabilities of occurring, but only assume that the elements are independent and identically distributed.
If the number of runs is significantly higher or lower than expected, the hypothesis of statistical independence of the elements may be rejected.
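As a rough sketch of how the test could be carried out in practice, the following computes the number of runs, the standard null mean 2N+N−/N + 1 and variance (μ − 1)(μ − 2)/(N − 1), the standardized statistic, and a two-sided p-value from the normal approximation. The function name `runs_test` and its return layout are assumptions for illustration, not a reference implementation.

```python
import math
from itertools import groupby


def runs_test(sequence):
    """Two-sided runs test using the normal approximation.

    `sequence` must contain exactly two distinct values.
    Returns (runs, mean, variance, z, p_value).
    """
    values = sorted(set(sequence))
    if len(values) != 2:
        raise ValueError("sequence must contain exactly two distinct values")

    n = len(sequence)
    n_pos = sum(1 for x in sequence if x == values[0])
    n_neg = n - n_pos

    # Each group produced by groupby is one run.
    runs = sum(1 for _ in groupby(sequence))

    mean = 2 * n_pos * n_neg / n + 1
    variance = (mean - 1) * (mean - 2) / (n - 1)
    if variance <= 0:
        raise ValueError("sequence too short for the normal approximation")

    z = (runs - mean) / math.sqrt(variance)
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return runs, mean, variance, z, p_value


runs, mean, variance, z, p_value = runs_test("++++---+++--++++++----")
print(runs, round(mean, 3), round(z, 3), round(p_value, 4))
```

For short sequences the normal approximation may be poor, and an exact calculation from the conditional distribution of the number of runs would then be preferable.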
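Encoding each element as {\displaystyle x_{i}=\pm 1}, the number of runs is
{\displaystyle R={\frac {1}{2}}\left(N+1-\sum _{i=1}^{N-1}x_{i}x_{i+1}\right)}
Because the elements are independent and identically distributed, every adjacent pair has the same joint distribution, so the expectation is
{\displaystyle E[R]={\frac {1}{2}}\left(N+1-(N-1)E[x_{1}x_{2}]\right)}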
Writing out all possibilities, we find
{\displaystyle x_{1}x_{2}={\begin{cases}+1\quad &{\text{ with probability }}{\frac {N_{+}(N_{+}-1)+N_{-}(N_{-}-1)}{N(N-1)}}\\-1\quad &{\text{ with probability }}{\frac {2N_{+}N_{-}}{N(N-1)}}\\\end{cases}}}
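Thus {\displaystyle E[x_{1}x_{2}]={\frac {(N_{+}-N_{-})^{2}-N}{N(N-1)}}}. Now simplify the expression to get
{\displaystyle E[R]={\frac {2N_{+}N_{-}}{N}}+1}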
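The simplification can be spelled out step by step. The following is a sketch of the algebra, using only the two-case distribution of the product of adjacent elements given above and the identity (N+ − N−)² = N² − 4N+N−:

```latex
\begin{align*}
E[x_1 x_2] &= \frac{N_+(N_+-1) + N_-(N_--1) - 2N_+N_-}{N(N-1)}
            = \frac{(N_+-N_-)^2 - N}{N(N-1)}, \\
E[R] &= \frac{1}{2}\left(N + 1 - (N-1)\,E[x_1 x_2]\right)
      = \frac{1}{2}\left(N + 1 - \frac{(N_+-N_-)^2 - N}{N}\right) \\
     &= \frac{1}{2}\left(N + 2 - \frac{N^2 - 4N_+N_-}{N}\right)
      = \frac{2N_+N_-}{N} + 1 .
\end{align*}
```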
Similarly, the variance of the number of runs is
{\displaystyle Var[R]={\frac {1}{4}}Var[\sum _{i=1}^{N-1}x_{i}x_{i+1}]={\frac {1}{4}}((N-1)E[x_{1}x_{2}x_{1}x_{2}]+2(N-2)E[x_{1}x_{2}x_{2}x_{3}]+(N-2)(N-3)E[x_{1}x_{2}x_{3}x_{4}]-(N-1)^{2}E[x_{1}x_{2}]^{2})}
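and simplifying, using {\displaystyle x_{i}^{2}=1} so that {\displaystyle E[x_{1}x_{2}x_{1}x_{2}]=1} and {\displaystyle E[x_{1}x_{2}x_{2}x_{3}]=E[x_{1}x_{3}]=E[x_{1}x_{2}]}, we obtain the variance {\displaystyle \sigma ^{2}={\frac {2N_{+}N_{-}(2N_{+}N_{-}-N)}{N^{2}(N-1)}}}.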
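The moment formulas can be checked exactly for small cases. Conditional on the counts N+ and N−, every arrangement of the signs is equally likely under the null hypothesis, so the sketch below enumerates all arrangements with exact rational arithmetic and compares the resulting mean and variance of the number of runs with the closed forms 2N+N−/N + 1 and (μ − 1)(μ − 2)/(N − 1). The function names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import combinations


def exact_run_moments(n_pos, n_neg):
    """Exact mean and variance of the number of runs over all arrangements."""
    n = n_pos + n_neg
    total = Fraction(0)
    total_sq = Fraction(0)
    count = 0
    # Choose which positions hold +1; the remaining positions hold -1.
    for plus_positions in combinations(range(n), n_pos):
        x = [-1] * n
        for i in plus_positions:
            x[i] = 1
        # R = (N + 1 - sum of products of adjacent elements) / 2
        r = Fraction(n + 1 - sum(x[i] * x[i + 1] for i in range(n - 1)), 2)
        total += r
        total_sq += r * r
        count += 1
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance


def formula_moments(n_pos, n_neg):
    """Closed-form mean and variance of the number of runs."""
    n = n_pos + n_neg
    mean = Fraction(2 * n_pos * n_neg, n) + 1
    variance = (mean - 1) * (mean - 2) / (n - 1)
    return mean, variance


for n_pos, n_neg in [(2, 3), (4, 4), (5, 3)]:
    assert exact_run_moments(n_pos, n_neg) == formula_moments(n_pos, n_neg)
```

The enumeration grows combinatorially with N, so this is only practical as a sanity check on short sequences.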
Similarly, we can calculate all moments of {\displaystyle R}, but the algebra becomes increasingly unwieldy.
If we sample longer and longer sequences, with {\displaystyle \lim N_{+}/N=p} for some fixed {\displaystyle p\in (0,1)}, then {\displaystyle {\frac {R-\mu }{\sigma }}} converges in distribution to the normal distribution with mean 0 and variance 1.
Proof sketch.
It suffices to prove the asymptotic normality of the sequence {\displaystyle \sum _{i=1}^{N-1}x_{i}x_{i+1}}, which can be proven by a martingale central limit theorem.
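The limiting behaviour can also be examined empirically. The sketch below draws random arrangements with a fixed proportion of positive elements, standardizes the number of runs using the null mean and variance, and reports the sample mean, sample variance, and the fraction of standardized values within ±1.96. The sample sizes, the seed, and the function name `standardized_runs` are arbitrary choices for illustration; a simulation of this kind suggests, but does not prove, the normal limit.

```python
import math
import random
import statistics


def standardized_runs(n_pos, n_neg, rng):
    """Standardized number of runs for one uniformly random arrangement."""
    signs = [1] * n_pos + [-1] * n_neg
    rng.shuffle(signs)
    n = n_pos + n_neg
    # One run to start with, plus one for every change of sign.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mean = 2 * n_pos * n_neg / n + 1
    sd = math.sqrt((mean - 1) * (mean - 2) / (n - 1))
    return (runs - mean) / sd


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
z_values = [standardized_runs(120, 280, rng) for _ in range(2000)]

sample_mean = statistics.fmean(z_values)
sample_var = statistics.pvariance(z_values)
coverage = sum(abs(z) < 1.96 for z in z_values) / len(z_values)
print(sample_mean, sample_var, coverage)
```

If the approximation is good, the printed values should be close to 0, 1, and 0.95 respectively.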
Runs tests can be used to test:
The Kolmogorov–Smirnov test has been shown to be more powerful than the Wald–Wolfowitz test for detecting differences between distributions that differ solely in their location.
However, the reverse is true if the distributions differ in variance and have at most a small difference in location.[citation needed]
The Wald–Wolfowitz runs test has been extended for use with several samples.