It is fundamental to the concept of typical set used in theories of data compression.
Roughly speaking, the theorem states that although there are many series of results that may be produced by a random process, the one actually produced is most probably from a loosely defined set of outcomes that all have approximately the same chance of being the one actually realized.
(This is a consequence of the law of large numbers and ergodic theory.)
One way of intuitively understanding the property is through Cramér's large deviation theorem, which states that the probability of a large deviation from mean decays exponentially with the number of samples.
In the field of pseudorandom number generation, a candidate generator of undetermined quality whose output sequence lies too far outside the typical set by some statistical criteria is rejected as insufficiently random.
, which must exist for all discrete-time stationary processes including the ergodic ones.
sources directly using the law of large numbers in both the discrete-valued case (where
The definition of the asymptotic equipartition property can also be extended for certain classes of continuous-time stochastic processes for which a typical set exists for long enough observation time.
The weak law of large numbers gives the asymptotic equipartition property with convergence in probability,
The strong law of large numbers asserts the stronger almost sure convergence,
The Shannon–McMillan–Breiman theorem, due to Claude Shannon, Brockway McMillan, and Leo Breiman, states that we have convergence in the sense of L1.
may take value in a set of countable infinity, provided that the entropy rate is still finite.
The assumptions of stationarity/ergodicity/identical distribution of random variables is not essential for the asymptotic equipartition property to hold.
Indeed, as is quite clear intuitively, the asymptotic equipartition property requires only some form of the law of large numbers to hold, which is fairly general.
However, the expression needs to be suitably generalized, and the conditions need to be formulated precisely.
We assume that the source is producing independent symbols, with possibly different output statistics at each instant.
The proof follows from a simple application of Markov's inequality (applied to second moment of
is uniformly bounded for r > 1 (again by Markov's inequality applied to r-th moment).
Even this condition is not necessary, but given a non-stationary random process, it should not be difficult to test whether the asymptotic equipartition property holds using the above method.
The asymptotic equipartition property for non-stationary discrete-time independent process leads us to (among other results) the source coding theorem for non-stationary source (with independent output symbols) and noisy-channel coding theorem for non-stationary memoryless channels.
If such interpolation f is measurable, we may define the continuous-time stationary process accordingly as
If the asymptotic equipartition property holds for the discrete-time process, as in the i.i.d.
An important class of such continuous-time stationary process is the bandlimited stationary ergodic process with the sample space being a subset of the continuous
The asymptotic equipartition property holds if the process is white, in which case the time samples are i.i.d., or there exists T > 1/2W, where W is the nominal bandwidth, such that the T-spaced time samples take values in a finite set, in which case we have the discrete-time finite-valued stationary ergodic process.
Any time-invariant operations also preserves the asymptotic equipartition property, stationarity and ergodicity and we may easily turn a stationary process to non-stationary without losing the asymptotic equipartition property by nulling out a finite number of time samples in the process.
A category theoretic definition for the equipartition property is given by Gromov.
of a measure space P, this sequence admits an asymptotically equivalent sequence HN of homogeneous measure spaces (i.e. all sets have the same measure; all morphisms are invariant under the group of automorphisms, and thus factor as a morphism to the terminal object).
This is given in terms of a distance function, giving how much an injective correspondence differs from an isomorphism.
is commonly known as the earth mover's distance or Wasserstein metric.
Given a homogenous space sequence HN that is asymptotically equivalent to PN, the entropy H(P) of P may be taken as