Reparameterization trick

The reparameterization trick (aka "reparameterization gradient estimator") is a technique used in statistical machine learning, particularly in variational inference, variational autoencoders, and stochastic optimization.

It allows for the efficient computation of gradients through random variables, enabling the optimization of parametric probability models using stochastic gradient descent, and the variance reduction of estimators.

Let $z$ be a random variable with distribution $q_\phi(z)$, where $\phi$ is a vector containing the parameters of the distribution.

Consider an objective function of the form:

$$L(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[f(z)]$$

Without the reparameterization trick, estimating the gradient $\nabla_\phi L(\phi)$ can be challenging, because the parameter $\phi$ appears in the random variable itself.

In more detail, we have to statistically estimate:

$$\nabla_\phi L(\phi) = \nabla_\phi \int q_\phi(z) f(z)\, dz$$

The REINFORCE estimator, widely used in reinforcement learning and especially policy gradient,[4] uses the following equality, which follows from the log-derivative identity $\nabla_\phi q_\phi(z) = q_\phi(z)\,\nabla_\phi \ln q_\phi(z)$:

$$\nabla_\phi L(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[f(z)\,\nabla_\phi \ln q_\phi(z)]$$

This allows the gradient to be estimated by Monte Carlo sampling:

$$\nabla_\phi L(\phi) \approx \frac{1}{N} \sum_{i=1}^{N} f(z_i)\,\nabla_\phi \ln q_\phi(z_i), \qquad z_i \sim q_\phi(z)$$

The REINFORCE estimator has high variance, and many methods have been developed to reduce it.[5]
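As a concrete illustration, the following is a minimal sketch of the REINFORCE estimator in PyTorch, assuming a toy objective $f(z) = z^2$ and a Gaussian $q_\phi = \mathcal{N}(\mu, \sigma^2)$; the variable names and the choice of $f$ are illustrative, not part of the original method.

```python
# Minimal REINFORCE (score-function) sketch for a toy problem.
# Assumed setup: f(z) = z**2, q_phi = Normal(mu, sigma).
# True gradients: d/dmu E[z^2] = 2*mu, d/dsigma E[z^2] = 2*sigma.
import torch

torch.manual_seed(0)
mu = torch.tensor(1.0, requires_grad=True)
sigma = torch.tensor(0.5, requires_grad=True)

def f(z):
    return z ** 2

N = 10_000
q = torch.distributions.Normal(mu, sigma)
z = q.sample((N,))                         # plain sampling: no gradient flows through z
surrogate = (f(z) * q.log_prob(z)).mean()  # gradient of this is the REINFORCE estimate
surrogate.backward()
print(mu.grad, sigma.grad)                 # noisy estimates of 2.0 and 1.0
```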

The reparameterization trick expresses $z$ as

$$z = g_\phi(\epsilon)$$

where $\epsilon$ is a noise variable drawn from a fixed distribution $p(\epsilon)$ that does not depend on $\phi$. This gives

$$L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[f(g_\phi(\epsilon))]$$

so the gradient can be moved inside the expectation and estimated by

$$\nabla_\phi L(\phi) = \mathbb{E}_{\epsilon \sim p(\epsilon)}[\nabla_\phi f(g_\phi(\epsilon))] \approx \frac{1}{N} \sum_{i=1}^{N} \nabla_\phi f(g_\phi(\epsilon_i)), \qquad \epsilon_i \sim p(\epsilon)$$
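For comparison, here is the same toy gradient computed with the reparameterization trick, writing $z = g_\phi(\epsilon) = \mu + \sigma\epsilon$ with $\epsilon \sim \mathcal{N}(0, 1)$ (again a sketch with illustrative names):

```python
# Reparameterized gradient for the same toy problem: z = mu + sigma * eps.
import torch

torch.manual_seed(0)
mu = torch.tensor(1.0, requires_grad=True)
sigma = torch.tensor(0.5, requires_grad=True)

N = 10_000
eps = torch.randn(N)        # eps ~ N(0, 1), a fixed distribution independent of phi
z = mu + sigma * eps        # z = g_phi(eps); differentiable in mu and sigma
loss = (z ** 2).mean()      # Monte Carlo estimate of E[f(g_phi(eps))]
loss.backward()
print(mu.grad, sigma.grad)  # estimates of 2.0 and 1.0
```

Running both sketches repeatedly typically shows the reparameterized estimates concentrating far more tightly around the true values, which is the practical motivation for the trick.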

For some common distributions, the reparameterization trick takes specific forms:[6]

Normal distribution: For $z \sim \mathcal{N}(\mu, \sigma^2)$, one can write

$$z = \mu + \sigma \epsilon, \qquad \epsilon \sim \mathcal{N}(0, 1)$$

In general, any distribution that is differentiable with respect to its parameters can be reparameterized by inverting its multivariable CDF and then applying the implicit method.

See [1] for an exposition and application to the Gamma, Beta, Dirichlet, and von Mises distributions.
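In the one-dimensional case, inverting the CDF gives an explicit reparameterization. Below is a sketch for the exponential distribution; the rate value is an arbitrary illustration.

```python
# Inverse-CDF reparameterization of Exp(rate): F(z) = 1 - exp(-rate*z),
# so z = -ln(1 - u) / rate with u ~ Uniform(0, 1).
import torch

torch.manual_seed(0)
rate = torch.tensor(2.0, requires_grad=True)

u = torch.rand(10_000)       # fixed noise, independent of the parameter
z = -torch.log1p(-u) / rate  # inverse CDF; differentiable in rate
loss = z.mean()              # estimates E[z] = 1/rate
loss.backward()
print(rate.grad)             # noisy estimate of -1/rate**2 = -0.25
```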

In Variational Autoencoders (VAEs), the VAE objective function, known as the Evidence Lower Bound (ELBO), is given by:

$$\text{ELBO}(\phi, \theta) = \mathbb{E}_{z \sim q_\phi(z|x)}[\ln p_\theta(x|z)] - D_{\mathrm{KL}}(q_\phi(z|x) \,\|\, p(z))$$

where $q_\phi(z|x)$ is the encoder (recognition model), $p_\theta(x|z)$ is the decoder (generative model), and $p(z)$ is the prior distribution over latent variables.

The gradient of the ELBO with respect to $\theta$ can be estimated directly, but the gradient with respect to $\phi$ runs into the problem described above, since $\phi$ parameterizes the sampling distribution. Express the sampling operation $z \sim q_\phi(z|x) = \mathcal{N}(\mu_\phi(x), \operatorname{diag}(\sigma_\phi^2(x)))$ as

$$z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$

where $\mu_\phi(x)$ and $\sigma_\phi(x)$ are the outputs of the encoder network, and $\odot$ denotes element-wise multiplication.

This allows us to estimate the gradient using Monte Carlo sampling:

$$\nabla_\phi \text{ELBO}(\phi, \theta) \approx \frac{1}{L} \sum_{l=1}^{L} \nabla_\phi \ln p_\theta\big(x \mid \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon^{(l)}\big) - \nabla_\phi D_{\mathrm{KL}}(q_\phi(z|x) \,\|\, p(z)), \qquad \epsilon^{(l)} \sim \mathcal{N}(0, I)$$

This formulation enables backpropagation through the sampling process, allowing for end-to-end training of the VAE model using stochastic gradient descent or its variants.
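A minimal PyTorch sketch of a reparameterized VAE training step is shown below; the layer sizes, Bernoulli likelihood, and closed-form Gaussian KL term are illustrative assumptions rather than requirements of the method.

```python
# Minimal VAE with a reparameterized latent sample and a closed-form Gaussian KL.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=8, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)         # mu_phi(x)
        self.log_sigma = nn.Linear(h_dim, z_dim)  # ln sigma_phi(x)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, log_sigma = self.mu(h), self.log_sigma(h)
        eps = torch.randn_like(mu)      # eps ~ N(0, I)
        z = mu + log_sigma.exp() * eps  # reparameterized sample from q_phi(z|x)
        logits = self.dec(z)            # decoder p_theta(x|z)
        recon = F.binary_cross_entropy_with_logits(logits, x, reduction='sum')
        # KL(N(mu, sigma^2) || N(0, I)) in closed form
        kl = -0.5 * torch.sum(1 + 2 * log_sigma - mu ** 2 - (2 * log_sigma).exp())
        return recon + kl               # negative ELBO

model = TinyVAE()
x = torch.rand(32, 784).round()         # stand-in batch of binary data
loss = model(x)
loss.backward()                         # gradients reach the encoder through z
```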

More generally, the trick allows using stochastic gradient descent for variational inference.

Let the variational objective (ELBO) be of the form:

$$\text{ELBO}(\phi) = \mathbb{E}_{z \sim q_\phi(z)}[\ln p(x, z) - \ln q_\phi(z)]$$

Using the reparameterization trick, we can estimate the gradient of this objective with respect to $\phi$:

$$\nabla_\phi \text{ELBO}(\phi) \approx \frac{1}{L} \sum_{l=1}^{L} \nabla_\phi \big[\ln p\big(x, g_\phi(\epsilon^{(l)})\big) - \ln q_\phi\big(g_\phi(\epsilon^{(l)})\big)\big], \qquad \epsilon^{(l)} \sim p(\epsilon)$$
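Below is a sketch of this procedure for a toy model $p(z) = \mathcal{N}(0, 1)$, $p(x|z) = \mathcal{N}(z, 1)$ with a single observation, using a Gaussian $q_\phi$; the model and hyperparameters are illustrative assumptions.

```python
# Stochastic-gradient variational inference on a toy conjugate model.
# True posterior given x = 2.0 is N(1.0, 0.5), i.e. mu = 1.0, sigma ~ 0.71.
import torch

torch.manual_seed(0)
x = torch.tensor(2.0)
mu = torch.tensor(0.0, requires_grad=True)
log_sigma = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    q = torch.distributions.Normal(mu, log_sigma.exp())
    z = q.rsample((64,))        # rsample() applies the reparameterization trick
    log_p = (torch.distributions.Normal(0.0, 1.0).log_prob(z)   # prior
             + torch.distributions.Normal(z, 1.0).log_prob(x))  # likelihood
    elbo = (log_p - q.log_prob(z)).mean()
    (-elbo).backward()          # maximize the ELBO by minimizing its negation
    opt.step()

print(mu.item(), log_sigma.exp().item())  # should approach 1.0 and ~0.71
```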

The reparameterization trick has been applied to reduce the variance in dropout, a regularization technique in neural networks.

The original dropout can be reparameterized with Bernoulli distributions:

$$y = (W \odot \epsilon)\, x, \qquad \epsilon_{ij} \sim \mathrm{Bernoulli}(p)$$

where $W$ is a weight matrix, $x$ is the input, and $\epsilon$ is a random binary mask over the individual weights.

The reparameterization trick can be applied to all such cases, resulting in the variational dropout method.
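As a sketch of the idea, the snippet below uses the Gaussian multiplicative noise common in variational dropout, since a Bernoulli mask is not differentiable in its parameter; the names and shapes are illustrative.

```python
# Reparameterized multiplicative noise in the spirit of variational dropout:
# eps ~ N(1, alpha) replaces the Bernoulli mask and is differentiable in alpha.
import torch

torch.manual_seed(0)
x = torch.randn(32, 100)                          # a batch of activations
log_alpha = torch.zeros(100, requires_grad=True)  # learnable per-unit noise variance

eps = 1 + log_alpha.exp().sqrt() * torch.randn_like(x)  # eps ~ N(1, alpha)
y = x * eps                                       # noisy activations
y.sum().backward()                                # gradient reaches log_alpha
print(log_alpha.grad.shape)                       # torch.Size([100])
```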

Figure: The scheme of the reparameterization trick. The random noise variable $\epsilon$ is injected into the latent space as an external input, so the gradient can be backpropagated without passing through a stochastic node during the update.
Figure: The scheme of a variational autoencoder after the reparameterization trick.