In probability theory and theoretical computer science, McDiarmid's inequality (named after Colin McDiarmid[1]) is a concentration inequality which bounds the deviation between the sampled value and the expected value of certain functions when they are evaluated on independent random variables.
McDiarmid's inequality applies to functions that satisfy a bounded differences property, meaning that replacing a single argument to the function while leaving all other arguments unchanged cannot change the value of the function by more than a fixed bound for that argument.
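To make the bounded differences property concrete, the following minimal sketch computes, by brute force over small finite domains, the largest change in a function when exactly one argument is replaced. The helper name `bounded_difference_constants` and the two example functions are illustrative assumptions, not part of any library or of the cited sources.

```python
from itertools import product


def bounded_difference_constants(f, domains):
    """For each coordinate i, return the largest |f(x) - f(y)| over all
    pairs x, y that differ only in coordinate i (finite domains only)."""
    n = len(domains)
    consts = [0.0] * n
    for x in product(*domains):
        for i in range(n):
            for alt in domains[i]:
                # y equals x except that coordinate i is replaced by alt.
                y = x[:i] + (alt,) + x[i + 1:]
                consts[i] = max(consts[i], abs(f(x) - f(y)))
    return consts


def sample_mean(x):
    return sum(x) / len(x)


def num_distinct(x):
    return len(set(x))


# Four binary arguments: replacing one argument moves the mean by at most 1/4.
mean_consts = bounded_difference_constants(sample_mean, [(0, 1)] * 4)

# Four ternary arguments: replacing one argument changes the number of
# distinct values by at most 1.
distinct_consts = bounded_difference_constants(num_distinct, [(0, 1, 2)] * 4)
```

The enumeration is exponential in the number of arguments, so it is only meant as a check on toy examples; in practice the constants are usually derived analytically.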
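A function $f: \mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_n \rightarrow \mathbb{R}$ satisfies the bounded differences property if substituting the value of the $i$th coordinate $x_i$ changes the value of $f$ by at most $c_i$. More formally, there are constants $c_1, c_2, \dots, c_n$ such that for all $i \in \{1, \dots, n\}$ and all $x_1 \in \mathcal{X}_1, \dots, x_n \in \mathcal{X}_n$,

$$\sup_{x_i' \in \mathcal{X}_i} \left| f(x_1, \dots, x_{i-1}, x_i, x_{i+1}, \dots, x_n) - f(x_1, \dots, x_{i-1}, x_i', x_{i+1}, \dots, x_n) \right| \leq c_i.$$

McDiarmid's inequality states the following. Let $f: \mathcal{X}_1 \times \mathcal{X}_2 \times \cdots \times \mathcal{X}_n \rightarrow \mathbb{R}$ satisfy the bounded differences property with bounds $c_1, c_2, \dots, c_n$. Consider independent random variables $X_1, X_2, \dots, X_n$ where $X_i \in \mathcal{X}_i$ for all $i$. Then, for any $\varepsilon > 0$,

$$\mathrm{P}\left(f(X_1, \dots, X_n) - \mathbb{E}[f(X_1, \dots, X_n)] \geq \varepsilon\right) \leq \exp\left(-\frac{2\varepsilon^2}{\sum_{i=1}^n c_i^2}\right),$$

$$\mathrm{P}\left(f(X_1, \dots, X_n) - \mathbb{E}[f(X_1, \dots, X_n)] \leq -\varepsilon\right) \leq \exp\left(-\frac{2\varepsilon^2}{\sum_{i=1}^n c_i^2}\right),$$

and as an immediate consequence,

$$\mathrm{P}\left(\left|f(X_1, \dots, X_n) - \mathbb{E}[f(X_1, \dots, X_n)]\right| \geq \varepsilon\right) \leq 2\exp\left(-\frac{2\varepsilon^2}{\sum_{i=1}^n c_i^2}\right).$$

A stronger bound may be given when the arguments to the function are sampled from unbalanced distributions, such that resampling a single argument rarely causes a large change to the function value.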
McDiarmid's Inequality (unbalanced)[3][4] — Let
drawn from a distribution where there is a particular value
This may be used to characterize, for example, the value of a function on graphs when evaluated on sparse random graphs and hypergraphs, since in a sparse random graph, it is much more likely for any particular edge to be missing than to be present.
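The sparse-graph remark can be illustrated with a small simulation. The sketch below assumes an Erdős–Rényi style model in which each edge is present independently with probability `p`, and uses the number of isolated vertices as a hypothetical example of a graph function; the function names and parameters are illustrative choices. It estimates how often resampling a single edge changes the function value at all, and how large the change can be.

```python
import random
from itertools import combinations


def sample_edges(n, p, rng):
    """Each of the n*(n-1)/2 possible edges is present independently w.p. p."""
    return {e: rng.random() < p for e in combinations(range(n), 2)}


def isolated_vertices(n, edges):
    """Number of vertices that have no incident edge."""
    touched = set()
    for (u, v), present in edges.items():
        if present:
            touched.update((u, v))
    return n - len(touched)


def resample_change_rate(n, p, trials, seed=0):
    """Fraction of trials in which resampling one uniformly chosen edge
    changes the isolated-vertex count, and the largest change observed."""
    rng = random.Random(seed)
    pairs = list(combinations(range(n), 2))
    changed = 0
    max_jump = 0
    for _ in range(trials):
        edges = sample_edges(n, p, rng)
        before = isolated_vertices(n, edges)
        e = rng.choice(pairs)
        edges[e] = rng.random() < p  # resample a single argument
        after = isolated_vertices(n, edges)
        changed += before != after
        max_jump = max(max_jump, abs(after - before))
    return changed / trials, max_jump


rate, jump = resample_change_rate(n=12, p=0.05, trials=2000)
```

Adding or removing one edge can change the isolated-vertex count by at most 2, so the worst-case bounded difference is 2. The resampled edge differs from the original only with probability $2p(1-p)$, however, so for small `p` most resamplings leave the function value unchanged.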
McDiarmid's inequality may be extended to the case where the function being analyzed does not strictly satisfy the bounded differences property, but large differences remain very rare.
McDiarmid's Inequality (Differences bounded with high probability)[5] — Let
, and as an immediate consequence,

There exist stronger refinements to this analysis in some distribution-dependent scenarios,[6] such as those that arise in learning theory.
th centered conditional version of a function
McDiarmid's Inequality (Sub-Gaussian norm)[7][8] — Let
th centered conditional version of
denote the sub-Gaussian norm of a random variable.
McDiarmid's Inequality (Sub-exponential norm)[8] — Let
th centered conditional version of
denote the sub-exponential norm of a random variable.
Refinements to McDiarmid's inequality in the style of Bennett's inequality and Bernstein inequalities are made possible by defining a variance term for each function argument.
Let

McDiarmid's Inequality (Bennett form)[4] — Let
be defined as at the beginning of this section.
McDiarmid's Inequality (Bernstein form)[4] — Let
be defined as at the beginning of this section.
The following proof of McDiarmid's inequality[2] constructs the Doob martingale tracking the conditional expected value of the function as more and more of its arguments are sampled and conditioned on, and then applies a martingale concentration inequality (Azuma's inequality).
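An alternate argument avoiding the use of martingales also exists, taking advantage of the independence of the function arguments to provide a Chernoff-bound-like argument.[4]

For better readability, we will introduce a notational shorthand: $z_{i \rightharpoondown j}$ will denote $z_i, \dots, z_j$ for any tuple $z$ and integers $1 \leq i \leq j \leq n$, so that, for example,

$$f(X_{1 \rightharpoondown (i-1)}, y, x_{(i+1) \rightharpoondown n}) := f(X_1, \dots, X_{i-1}, y, x_{i+1}, \dots, x_n).$$

Pick any $x_1', x_2', \dots, x_n'$. Then, for any $x_1, x_2, \dots, x_n$, by the triangle inequality,

$$\left|f(x_{1 \rightharpoondown n}) - f(x'_{1 \rightharpoondown n})\right| \leq \sum_{i=1}^n \left|f(x'_{1 \rightharpoondown (i-1)}, x_{i \rightharpoondown n}) - f(x'_{1 \rightharpoondown i}, x_{(i+1) \rightharpoondown n})\right| \leq c_1 + \cdots + c_n,$$

and thus $f$ is bounded.

Since $f$ is bounded, define the Doob martingale $\{Z_i\}$ (each $Z_i$ being a random variable depending on the random values of $X_1, \dots, X_i$) as

$$Z_i := \mathbb{E}[f(X_{1 \rightharpoondown n}) \mid X_{1 \rightharpoondown i}]$$

for all $i \geq 1$ and $Z_0 := \mathbb{E}[f(X_{1 \rightharpoondown n})]$, so that $Z_n = f(X_{1 \rightharpoondown n})$.

Now define the random variables for each $i$

$$U_i := \sup_{x \in \mathcal{X}_i} \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, x, X_{(i+1) \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}, X_i = x] - \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, X_{i \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}],$$

$$L_i := \inf_{x \in \mathcal{X}_i} \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, x, X_{(i+1) \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}, X_i = x] - \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, X_{i \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}].$$

Since $X_i, \dots, X_n$ are independent of each other, conditioning on $X_i = x$ does not affect the probabilities of the other variables, so these are equal to the expressions

$$U_i = \sup_{x \in \mathcal{X}_i} \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, x, X_{(i+1) \rightharpoondown n}) - f(X_{1 \rightharpoondown (i-1)}, X_{i \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}],$$

$$L_i = \inf_{x \in \mathcal{X}_i} \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, x, X_{(i+1) \rightharpoondown n}) - f(X_{1 \rightharpoondown (i-1)}, X_{i \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}].$$

Note that $L_i \leq Z_i - Z_{i-1} \leq U_i$. In addition,

$$U_i - L_i = \sup_{u, l \in \mathcal{X}_i} \mathbb{E}[f(X_{1 \rightharpoondown (i-1)}, u, X_{(i+1) \rightharpoondown n}) - f(X_{1 \rightharpoondown (i-1)}, l, X_{(i+1) \rightharpoondown n}) \mid X_{1 \rightharpoondown (i-1)}] \leq \sup_{u, l \in \mathcal{X}_i} \mathbb{E}[c_i \mid X_{1 \rightharpoondown (i-1)}] = c_i.$$

Then, applying the general form of Azuma's inequality to $\{Z_i\}$, we have

$$\mathrm{P}\left(f(X_1, \dots, X_n) - \mathbb{E}[f(X_1, \dots, X_n)] \geq \varepsilon\right) = \mathrm{P}(Z_n - Z_0 \geq \varepsilon) \leq \exp\left(-\frac{2\varepsilon^2}{\sum_{i=1}^n c_i^2}\right).$$

The one-sided bound in the other direction is obtained by applying Azuma's inequality to $\{-Z_i\}$, and the two-sided bound follows from a union bound.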
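The Doob martingale used in the proof can be computed exactly for a toy case. The sketch below assumes independent Bernoulli(`q`) arguments and uses the number of runs in a binary string as a hypothetical example function, for which flipping one bit changes the value by at most 2. All names and parameters are illustrative assumptions. It evaluates the conditional expectations by enumeration and records the largest martingale increment.

```python
from itertools import product


def num_runs(x):
    """Number of maximal blocks of equal consecutive bits."""
    return 1 + sum(x[j] != x[j + 1] for j in range(len(x) - 1))


def conditional_expectation(f, prefix, n, q):
    """E[f(X) | first len(prefix) coordinates equal prefix], where the
    remaining coordinates are independent Bernoulli(q)."""
    total = 0.0
    for tail in product((0, 1), repeat=n - len(prefix)):
        prob = 1.0
        for b in tail:
            prob *= q if b == 1 else 1 - q
        total += prob * f(prefix + tail)
    return total


def doob_martingale(f, x, q):
    """Return [Z_0, ..., Z_n] along the outcome x, where Z_i conditions on
    the first i coordinates of x."""
    n = len(x)
    return [conditional_expectation(f, tuple(x[:i]), n, q) for i in range(n + 1)]


def max_increment(f, n, q):
    """Largest |Z_i - Z_{i-1}| over all outcomes and all steps i."""
    worst = 0.0
    for x in product((0, 1), repeat=n):
        z = doob_martingale(f, x, q)
        worst = max(worst, max(abs(z[i] - z[i - 1]) for i in range(1, n + 1)))
    return worst


n, q = 6, 0.3
z = doob_martingale(num_runs, (0, 1, 1, 0, 0, 1), q)
worst = max_increment(num_runs, n, q)
```

Here `z[0]` is the unconditional expectation, `z[-1]` is the function value on the fully revealed outcome, and `worst` stays below the bounded-difference constant, which is the property Azuma's inequality relies on.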