In the theory of probability and statistics, the Dvoretzky–Kiefer–Wolfowitz–Massart inequality (DKW inequality) provides a bound on the worst case distance of an empirically determined distribution function from its associated population distribution function.
It is named after Aryeh Dvoretzky, Jack Kiefer, and Jacob Wolfowitz, who in 1956 proved the inequality with an unspecified multiplicative constant C in front of the exponential on the right-hand side.
[1] In 1990, Pascal Massart proved the inequality with the sharp constant C = 2,[2] confirming a conjecture due to Birnbaum and McCarty.
[3] Given a natural number n, let X1, X2, …, Xn be real-valued independent and identically distributed random variables with cumulative distribution function F(·).
Let Fn denote the associated empirical distribution function defined by

    Fn(x) = (1/n) Σ_{i=1}^{n} 1{Xi ≤ x},   x ∈ ℝ,

so Fn(x) is the fraction of the n sample values that are less than or equal to x.
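As a concrete illustration (a minimal sketch; the function names `ecdf` and `F_n` are ours, not from the source), the empirical distribution function can be computed in a few lines of Python:

```python
import numpy as np

def ecdf(samples):
    """Return the empirical CDF F_n of the given samples as a callable."""
    samples = np.sort(np.asarray(samples, dtype=float))

    def F_n(x):
        # searchsorted with side="right" counts how many samples are <= x;
        # dividing by n gives the fraction, i.e. the empirical CDF at x.
        return np.searchsorted(samples, x, side="right") / samples.size

    return F_n

F_n = ecdf([0.2, 0.5, 0.5, 0.9])
```

For example, `F_n(0.5)` returns 0.75 here, since three of the four samples are at or below 0.5.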
The Dvoretzky–Kiefer–Wolfowitz inequality bounds the probability that the random function Fn differs from F by more than a given constant ε > 0 anywhere on the real line. The one-sided estimate states that

    Pr( sup_{x∈ℝ} (Fn(x) − F(x)) > ε ) ≤ e^{−2nε²}   for every ε ≥ sqrt((1/(2n)) ln 2),

which in turn implies the two-sided estimate

    Pr( sup_{x∈ℝ} |Fn(x) − F(x)| > ε ) ≤ 2 e^{−2nε²}   for every ε > 0.
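The two-sided bound Pr(sup_x |Fn(x) − F(x)| > ε) ≤ 2e^{−2nε²} can be checked by simulation. The sketch below (illustrative, not from the source; parameter values are arbitrary choices) draws samples from the Uniform(0, 1) distribution, whose CDF is simply F(x) = x, and compares the observed violation rate with the bound:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps, trials = 100, 0.15, 5000

violations = 0
for _ in range(trials):
    x = np.sort(rng.uniform(size=n))      # Uniform(0,1): true CDF is F(x) = x
    # The supremum of |Fn - F| over the real line is determined by the sample
    # points: compare F with Fn just right of each jump (i/n) and just left
    # of it ((i-1)/n).
    Fn_right = np.arange(1, n + 1) / n
    Fn_left = np.arange(0, n) / n
    sup_dev = max(np.abs(Fn_right - x).max(), np.abs(Fn_left - x).max())
    violations += sup_dev > eps

rate = violations / trials
dkw_bound = 2 * np.exp(-2 * n * eps ** 2)   # = 2 * exp(-4.5), about 0.0222
# The observed rate stays below dkw_bound, up to Monte Carlo noise.
```

Since Massart's constant C = 2 is sharp, the observed violation rate comes close to the bound rather than sitting far below it.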
A version of the Dvoretzky–Kiefer–Wolfowitz inequality has also been obtained for the Kaplan–Meier estimator, the right-censored data analog of the empirical distribution function.
The DKW bounds run parallel to, and equidistant above and below, the empirical CDF. Because this confidence band has the same width at every point of the support, the rate at which the true CDF crosses the band can differ across the support of the distribution.
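Inverting the two-sided bound gives the usual DKW confidence band: setting 2e^{−2nε²} = α and solving for ε yields the half-width ε = sqrt(ln(2/α) / (2n)). A minimal sketch (the helper name `dkw_band` is ours, not from the source):

```python
import numpy as np

def dkw_band(samples, alpha=0.05):
    """Simultaneous 1 - alpha confidence band for the CDF via DKW-Massart.

    Solving 2 * exp(-2 * n * eps**2) = alpha for eps gives the half-width
    eps = sqrt(log(2 / alpha) / (2 * n)), the same at every point.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    Fn = np.arange(1, n + 1) / n          # empirical CDF at the sorted points
    lower = np.clip(Fn - eps, 0.0, 1.0)   # a CDF must stay within [0, 1]
    upper = np.clip(Fn + eps, 0.0, 1.0)
    return x, lower, upper, eps
```

For n = 100 and α = 0.05 the half-width is ε = sqrt(ln(40)/200) ≈ 0.136, regardless of the underlying distribution.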