The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization.
It is applicable to both combinatorial and continuous problems, with either a static or noisy objective.
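The method approximates the optimal importance sampling estimator by repeating two phases:[1] drawing a sample from a probability distribution, and minimizing the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration.
Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems.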
The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems.
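Consider the general problem of estimating the quantity
$$\ell = \mathbb{E}_{\mathbf{u}}[H(\mathbf{X})] = \int H(\mathbf{x})\, f(\mathbf{x};\mathbf{u})\, d\mathbf{x},$$
where $H$ is some performance function and $f(\mathbf{x};\mathbf{u})$ is a member of some parametric family of distributions.
Using importance sampling this quantity can be estimated as
$$\hat{\ell} = \frac{1}{N}\sum_{i=1}^{N} H(\mathbf{X}_i)\,\frac{f(\mathbf{X}_i;\mathbf{u})}{g(\mathbf{X}_i)},$$
where $\mathbf{X}_1,\dots,\mathbf{X}_N$ is a random sample from $g$.
For positive $H$, the theoretically optimal importance sampling density (PDF) is given by
$$g^*(\mathbf{x}) = \frac{H(\mathbf{x})\, f(\mathbf{x};\mathbf{u})}{\ell}.$$
This, however, depends on the unknown $\ell$.
The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF $g^*$.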
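The importance sampling estimator can be illustrated with a minimal sketch. The setup is an assumption chosen for illustration, not taken from the text: the nominal density is a standard normal, the performance function is the indicator of the rare event $x \geq 4$, and the proposal is a unit-variance normal centered at the threshold. The function names `normal_pdf` and `importance_sampling_estimate` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1-dimensional normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_estimate(H, f, g_pdf, g_sample, n, rng):
    """Average H(X) * f(X) / g(X) over n draws X from the proposal g."""
    total = 0.0
    for _ in range(n):
        x = g_sample(rng)
        total += H(x) * f(x) / g_pdf(x)
    return total / n


rng = random.Random(0)
threshold = 4.0  # illustrative rare-event level (assumption)

estimate = importance_sampling_estimate(
    H=lambda x: 1.0 if x >= threshold else 0.0,     # indicator performance function
    f=lambda x: normal_pdf(x, 0.0, 1.0),            # nominal density
    g_pdf=lambda x: normal_pdf(x, threshold, 1.0),  # proposal density
    g_sample=lambda r: r.gauss(threshold, 1.0),     # sampler for the proposal
    n=100_000,
    rng=rng,
)

# Closed-form tail probability of the standard normal, for comparison.
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))
print(estimate, exact)
```

Sampling directly from the nominal density would rarely produce a value above the threshold, so the naive estimator would be dominated by zeros. Shifting the proposal toward the rare event and reweighting by the density ratio is what keeps the variance manageable.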
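The generic CE algorithm proceeds as follows:
1. Choose an initial parameter vector $\mathbf{v}^{(0)}$ and set $t = 1$.
2. Generate a random sample $\mathbf{X}_1,\dots,\mathbf{X}_N$ from $f(\cdot;\mathbf{v}^{(t-1)})$.
3. Solve for $\mathbf{v}^{(t)}$, where $$\mathbf{v}^{(t)} = \operatorname*{argmax}_{\mathbf{v}} \frac{1}{N}\sum_{i=1}^{N} H(\mathbf{X}_i)\,\frac{f(\mathbf{X}_i;\mathbf{u})}{f(\mathbf{X}_i;\mathbf{v}^{(t-1)})}\,\log f(\mathbf{X}_i;\mathbf{v}).$$
4. If convergence is reached, stop; otherwise, increase $t$ by 1 and repeat from step 2.
In several cases, the solution to step 3 can be found analytically.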
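Situations in which this occurs include the case where the parametric density $f$ belongs to the natural exponential family and the case where $f$ is discrete with finite support.
The same CE algorithm can be used for optimization, rather than estimation.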
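Suppose the problem is to maximize some function $S(x)$.
To apply CE, one considers first the associated stochastic problem of estimating $\mathbb{P}_{\boldsymbol{\theta}}(S(X) \geq \gamma)$ for a given level $\gamma$ and parametric family $\{f(\cdot;\boldsymbol{\theta})\}$, for example the 1-dimensional Gaussian distribution, parameterized by its mean $\mu$ and variance $\sigma^2$ (so $\boldsymbol{\theta} = (\mu, \sigma^2)$ here).
Hence, for a given $\gamma$, the goal is to find $\boldsymbol{\theta}$ so that $D_{\mathrm{KL}}(\mathbf{1}_{\{S(x) \geq \gamma\}} \,\|\, f_{\boldsymbol{\theta}})$ is minimized.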
This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above.
It turns out that the parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value $\geq \gamma$.
The worst of the elite samples is then used as the level parameter for the next iteration.
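A short worked derivation shows why the elite-sample statistics appear. It assumes the 1-dimensional Gaussian family with mean $\mu$ and variance $\sigma^2$, a level $\gamma$, and samples $X_1,\dots,X_N$ drawn from the current density, so that no likelihood-ratio weights are needed. The elite index set $\mathcal{E}$ and its size $N_e$ are notation introduced here for illustration. Minimizing the KL divergence over the parameters is equivalent to maximizing the indicator-weighted log-likelihood:

$$\max_{\mu,\sigma^2}\; \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}_{\{S(X_i)\geq\gamma\}}\,\log f(X_i;\mu,\sigma^2), \qquad \log f(x;\mu,\sigma^2) = -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2}.$$

Writing $\mathcal{E} = \{i : S(X_i)\geq\gamma\}$ and $N_e = |\mathcal{E}|$, setting the partial derivatives with respect to $\mu$ and $\sigma^2$ to zero gives

$$\sum_{i\in\mathcal{E}} \frac{X_i-\mu}{\sigma^2} = 0 \;\Longrightarrow\; \hat{\mu} = \frac{1}{N_e}\sum_{i\in\mathcal{E}} X_i,$$

$$\sum_{i\in\mathcal{E}} \left(-\frac{1}{2\sigma^2} + \frac{(X_i-\hat{\mu})^2}{2\sigma^4}\right) = 0 \;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{N_e}\sum_{i\in\mathcal{E}} (X_i-\hat{\mu})^2.$$

These are the maximum likelihood estimates computed from the elite samples only, with the variance normalized by $N_e$.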
This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm.
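A minimal Python sketch of such a Gaussian CE optimizer is given here under stated assumptions. The bimodal objective, the starting values (mean -6, variance 100), the sample size, the elite count and the stopping tolerance are all illustrative choices, and the names `objective` and `cross_entropy_maximize` are hypothetical. It is not a definitive implementation.

```python
import math
import random


def objective(x):
    """Illustrative bimodal objective (assumption): global maximum near x = 2."""
    return math.exp(-(x - 2.0) ** 2) + 0.8 * math.exp(-(x + 2.0) ** 2)


def cross_entropy_maximize(S, mu, sigma2, n_samples=100, n_elite=10,
                           max_iters=100, eps=1e-8, seed=0):
    """Maximize S with a 1-dimensional Gaussian sampling distribution."""
    rng = random.Random(seed)
    for _ in range(max_iters):
        if sigma2 <= eps:  # distribution has collapsed: stop
            break
        std = math.sqrt(sigma2)
        # Draw a sample from the current sampling distribution.
        samples = [rng.gauss(mu, std) for _ in range(n_samples)]
        # Keep the elite samples: those with the highest objective values.
        samples.sort(key=S, reverse=True)
        elite = samples[:n_elite]
        # Update the parameters with the elite mean and variance
        # (variance normalized by n_elite, the maximum likelihood form).
        mu = sum(elite) / n_elite
        sigma2 = sum((x - mu) ** 2 for x in elite) / n_elite
    return mu, sigma2


mu, sigma2 = cross_entropy_maximize(objective, mu=-6.0, sigma2=100.0)
print(mu, sigma2)
```

Selecting a fixed number of elite samples sets the level implicitly: the worst retained sample plays the role of the level parameter in each iteration. Because the search is randomized, a run may settle on either mode of a multimodal objective, depending on the seed and the initial parameters.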