Box–Muller transform

The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller,[1] is a random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers.

The method was first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in their 1934 treatise on Fourier transforms in the complex domain.[2] Given the status of these latter authors and the widespread availability and use of their treatise, it is almost certain that Box and Muller were well aware of its contents.

The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval (0,1) and maps them to two standard, normally distributed samples.

The polar form takes two samples from a different interval, [−1, +1], and maps them to two normally distributed samples without the use of sine or cosine functions.[3]

The ziggurat algorithm gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. GPUs or modern CPUs).[4]

Suppose U1 and U2 are independent samples chosen from the uniform distribution on the unit interval (0, 1). Let

    Z0 = R cos(Θ) = sqrt(−2 ln U1) cos(2π U2)
    Z1 = R sin(Θ) = sqrt(−2 ln U1) sin(2π U2).

Then Z0 and Z1 are independent random variables with a standard normal distribution.
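The basic form above can be sketched in C++ as follows. This is a minimal illustration, not a reference implementation; the helper name and the assumption that u1 lies strictly in (0, 1) are choices made here:

```cpp
#include <cmath>
#include <utility>

// Basic Box–Muller: map two independent uniform(0,1) samples to two
// independent standard normal samples. Requires u1 > 0 so that the
// logarithm is finite.
std::pair<double, double> boxMuller(double u1, double u2) {
    const double two_pi = 2.0 * std::acos(-1.0);
    const double r = std::sqrt(-2.0 * std::log(u1));  // radius R
    const double theta = two_pi * u2;                 // angle in [0, 2*pi)
    return {r * std::cos(theta), r * std::sin(theta)};
}
```

For example, u2 = 0.25 gives an angle of π/2, so the first output is (numerically) zero and the second equals the radius.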

The derivation[5] is based on a property of a two-dimensional Cartesian system: when the X and Y coordinates are described by two independent and standard normally distributed random variables, the random variables for R2 and Θ (the squared radius and the angle) in the corresponding polar coordinates are also independent and can be expressed as

    R2 = −2 ln U1 and Θ = 2π U2.

Because R2 is the square of the norm of the standard bivariate normal variable (X, Y), it has the chi-squared distribution with two degrees of freedom.

In the special case of two degrees of freedom, the chi-squared distribution coincides with the exponential distribution, and the equation for R2 above is a simple way of generating the required exponential variate.[7]

While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in Numerical Recipes.
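As a quick numerical check of the exponential claim: since U1 is uniform on (0, 1), the variate −2 ln U1 should have mean 2, the mean of a chi-squared variate with two degrees of freedom. A sketch (the seed and sample size below are arbitrary choices):

```cpp
#include <cmath>
#include <limits>
#include <random>

// Estimate the mean of R^2 = -2 ln(U1) for U1 ~ Uniform(0,1).
// A chi-squared variate with two degrees of freedom has mean 2.
double meanOfRSquared(unsigned seed, int n) {
    std::mt19937 gen(seed);
    // Lower endpoint bounded away from 0 so log() stays finite.
    std::uniform_real_distribution<double> unif(
        std::numeric_limits<double>::min(), 1.0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += -2.0 * std::log(unif(gen));
    return sum / n;
}
```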

A slightly different form is described as "Algorithm P" by D. Knuth in The Art of Computer Programming.[8]

Given u and v, independent and uniformly distributed in the closed interval [−1, +1], set s = R2 = u² + v². If s = 0 or s ≥ 1, discard the pair and sample a new one; because only points strictly inside the unit circle are accepted, s is then uniformly distributed in the open interval (0, 1). The latter can be seen by calculating the cumulative distribution function for s in the interval (0, 1): P(s ≤ t) is the ratio of the area of a disc of radius sqrt(t) to that of the unit disc, which equals t.

From this we find the probability density function to have the constant value 1 on the interval (0, 1).

The fraction θ/(2π), where θ is the angle of the accepted point (u, v) in polar coordinates (so that cos θ = u/sqrt(s) and sin θ = v/sqrt(s)), is uniformly distributed in the interval [0, 1) and independent of s. We now identify the value of s with that of U1 and θ/(2π) with that of U2 in the basic form, giving

    z0 = sqrt(−2 ln s) cos θ = u · sqrt(−2 ln s / s)
    z1 = sqrt(−2 ln s) sin θ = v · sqrt(−2 ln s / s).

The advantage is that calculating the trigonometric functions directly can be avoided.

This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.

Just as the basic form produces two standard normal deviates, so does this alternate calculation.

It discards some generated random numbers, but can be faster than the basic method because it is simpler to compute (provided that the random number generator is relatively fast) and is more numerically robust.[9]

Avoiding the use of expensive trigonometric functions improves speed over the basic form.
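The rejection loop and substitution described above can be sketched as follows. This is a sketch of the polar variant in general, not Knop's exact code; the function name and use of std::mt19937 are choices made here:

```cpp
#include <cmath>
#include <random>
#include <utility>

// Polar variant: draw (u, v) uniformly from the square [-1,1] x [-1,1],
// reject points outside the unit circle (and the origin), then reuse
// s = u^2 + v^2 as the uniform variate and u/sqrt(s), v/sqrt(s) as the
// cosine and sine, so no trigonometric function is ever called.
std::pair<double, double> polarGaussian(std::mt19937& gen) {
    std::uniform_real_distribution<double> unif(-1.0, 1.0);
    double u, v, s;
    do {
        u = unif(gen);
        v = unif(gen);
        s = u * u + v * v;
    } while (s >= 1.0 || s == 0.0);
    const double factor = std::sqrt(-2.0 * std::log(s) / s);
    return {u * factor, v * factor};
}
```

The do/while loop is the "conditional loop" mentioned below: on average about 4/π ≈ 1.27 candidate pairs are needed per accepted pair.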

The basic form requires two multiplications, 1/2 logarithm, 1/2 square root, and one trigonometric function for each normal variate.[10]

On some processors, the cosine and sine of the same argument can be calculated in parallel using a single instruction.

Notably for Intel-based machines, one can use the fsincos assembler instruction or the expi instruction (usually available from C as an intrinsic function) to calculate the complex exponential e^(iz) = cos z + i sin z, obtaining the sine and cosine in one operation.
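In portable C++ the same idea can be expressed through the complex exponential; whether the library maps this to a single sincos-style instruction depends on the compiler and platform (the helper name here is a choice made for illustration):

```cpp
#include <complex>
#include <utility>

// One complex exponential yields both trigonometric values:
// the real part of exp(i*theta) is cos(theta), the imaginary
// part is sin(theta).
std::pair<double, double> cosAndSin(double theta) {
    const std::complex<double> e =
        std::exp(std::complex<double>(0.0, theta));
    return {e.real(), e.imag()};
}
```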

The polar form requires 3/2 multiplications, 1/2 logarithm, 1/2 square root, and 1/2 division for each normal variate.

The effect is to replace one multiplication and one trigonometric function with a single division and a conditional loop.

When a computer is used to produce a uniform random variable, it will inevitably have some inaccuracies, because there is a lower bound on how close floating-point numbers can be to 0; if U1 reaches 0 exactly, the logarithm in the transform diverges.
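One defensive pattern, sketched here under the assumption that the generator is a std::mt19937, is to bound the uniform variate strictly away from 0 so that the logarithm in the transform always stays finite:

```cpp
#include <limits>
#include <random>

// Draw a uniform variate guaranteed to be strictly positive, so that
// log(u1) in the Box–Muller formulas never diverges.
double positiveUniform(std::mt19937& gen) {
    // Smallest positive normalized double as the lower endpoint;
    // uniform_real_distribution samples from the half-open [eps, 1).
    constexpr double eps = std::numeric_limits<double>::min();
    std::uniform_real_distribution<double> unif(eps, 1.0);
    return unif(gen);
}
```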

The implementation below in standard C++ generates values from any normal distribution with mean μ and variance σ2. The random number generator has been seeded to ensure that new, pseudo-random values will be returned from sequential calls to the generateGaussianNoise function.
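A minimal sketch along the lines described (the function name generateGaussianNoise comes from the text; the mu/sigma parameter names and seeding a static std::mt19937 from std::random_device are assumptions made here):

```cpp
#include <cmath>
#include <random>
#include <utility>

// Generate a pair of independent samples from N(mu, sigma^2) using the
// basic Box–Muller transform. The static generator is seeded once, so
// sequential calls return fresh pseudo-random values.
std::pair<double, double> generateGaussianNoise(double mu, double sigma) {
    static std::mt19937 gen(std::random_device{}());
    static std::uniform_real_distribution<double> unif(0.0, 1.0);

    constexpr double two_pi = 2.0 * 3.14159265358979323846;

    double u1;
    do {
        u1 = unif(gen);
    } while (u1 == 0.0);            // keep log(u1) finite
    const double u2 = unif(gen);

    const double mag = sigma * std::sqrt(-2.0 * std::log(u1));
    return {mag * std::cos(two_pi * u2) + mu,
            mag * std::sin(two_pi * u2) + mu};
}
```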

Visualisation of the Box–Muller transform: the coloured points in the unit square (u1, u2), drawn as circles, are mapped to a 2D Gaussian (z0, z1), drawn as crosses. The plots at the margins are the probability density functions of z0 and z1. Note that z0 and z1 are unbounded; they appear to lie in [−2.5, 2.5] only because of the choice of the illustrated points.
Two uniformly distributed values, u and v, are used to produce the value s = R2, which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.