The nested sampling algorithm is a computational approach to the Bayesian statistics problems of comparing models and generating samples from posterior distributions.
It was developed in 2004 by physicist John Skilling.[1] Bayes' theorem can be applied to a pair of competing models $M_1$ and $M_2$ for data $D$, one of which may be true (though which one is unknown) but which cannot both be true simultaneously. The posterior probability for $M_1$ may be calculated as

$$P(M_1 \mid D) = \frac{P(D \mid M_1)\,P(M_1)}{P(D)} = \frac{P(D \mid M_1)\,P(M_1)}{P(D \mid M_1)\,P(M_1) + P(D \mid M_2)\,P(M_2)} = \frac{1}{1 + \dfrac{P(D \mid M_2)}{P(D \mid M_1)}\,\dfrac{P(M_2)}{P(M_1)}}.$$

The prior probabilities $P(M_1)$ and $P(M_2)$ are already known, as they are chosen by the researcher ahead of time. The remaining Bayes factor $P(D \mid M_2)/P(D \mid M_1)$, however, is not so easy to evaluate, since in general it requires marginalizing nuisance parameters. Generally, $M_1$ has a set of parameters that can be grouped together and called $\theta_1$, and $M_2$ has its own vector of parameters that may be of different dimensionality, but is still termed $\theta_2$. The marginalization for $M_1$ is

$$P(D \mid M_1) = \int P(D \mid \theta_1, M_1)\,P(\theta_1 \mid M_1)\,d\theta_1,$$

and likewise for $M_2$. This integral, known as the marginal likelihood or evidence, is often analytically intractable, and in these cases it is necessary to employ a numerical algorithm to find an approximation.
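As a concrete illustration of what this marginalization involves, the following Python sketch (an invented toy example, not taken from the literature cited here) computes the marginal likelihood $P(D \mid M_1)$ for a one-dimensional Gaussian model with a uniform prior by brute-force quadrature; the data, prior range and grid size are all assumptions made for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=20)   # hypothetical data set D

def log_likelihood(theta):
    """log P(D | theta, M): the data are modelled as N(theta, 1)."""
    return -0.5 * np.sum((data - theta) ** 2) - 0.5 * len(data) * np.log(2 * np.pi)

def prior_pdf(theta):
    """P(theta | M): uniform prior on [-5, 5]."""
    return np.where(np.abs(theta) <= 5.0, 1.0 / 10.0, 0.0)

# Brute-force quadrature of Z = integral of P(D|theta,M) P(theta|M) d(theta)
# over the prior support.
theta_grid = np.linspace(-5.0, 5.0, 2001)
integrand = np.exp(np.array([log_likelihood(t) for t in theta_grid])) * prior_pdf(theta_grid)
Z = np.trapz(integrand, theta_grid)
print("evidence Z =", Z)
```

A grid of this kind only works in very low dimensions, since its cost grows exponentially with the number of parameters; this is the situation that general-purpose schemes such as nested sampling are designed to handle.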
The nested sampling algorithm was developed by John Skilling specifically to approximate these marginalization integrals, and it has the added benefit of generating samples from the posterior distribution.[2] It is an alternative to methods from the Bayesian literature[3] such as bridge sampling and defensive importance sampling.
Here is a simple version of the nested sampling algorithm, followed by a description of how it computes the marginal probability density $Z = P(D \mid M)$, where $M$ is $M_1$ or $M_2$:

  Start with $N$ points $\theta_1, \ldots, \theta_N$ sampled from the prior.
  for $i = 1$ to $j$ do
      $L_i := \min(\text{current likelihood values of the points})$
      $X_i := \exp(-i/N)$
      $w_i := X_{i-1} - X_i$  (with $X_0 = 1$)
      $Z := Z + L_i \cdot w_i$
      Save the point with least likelihood as a sample point with weight $w_i$.
      Update the point with least likelihood with some Markov chain Monte Carlo steps according to the prior, accepting only steps that keep the likelihood above $L_i$.
  end
  return $Z$

At each iteration, $X_i$ is an estimate of the amount of prior mass covered by the hypervolume in parameter space of all points with likelihood greater than $L_i$. The weight factor $w_i$ is an estimate of the amount of prior mass that lies between the two nested likelihood hypersurfaces $\{\theta : P(D \mid \theta, M) = L_{i-1}\}$ and $\{\theta : P(D \mid \theta, M) = L_i\}$. The update step $Z := Z + L_i \cdot w_i$ accumulates the sum $\sum_i L_i w_i$, which numerically approximates the integral

$$Z = P(D \mid M) = \int P(D \mid \theta, M)\,P(\theta \mid M)\,d\theta = \int_0^1 L\,dX.$$

In the limit $j \to \infty$, this estimator has a positive bias of order $1/N$. This can be thought of as a Bayesian's way to numerically implement Lebesgue integration.[5]
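To make the loop above concrete, here is a minimal, self-contained Python sketch (written for this summary, not Skilling's reference code) applied to an invented two-dimensional problem with a uniform prior and a Gaussian likelihood; the crude replacement step simply redraws from the prior until the likelihood constraint is satisfied, which is valid but far less efficient than the strategies discussed below:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy problem: uniform prior on [-5, 5]^2, Gaussian likelihood at the origin.
def sample_prior():
    return rng.uniform(-5.0, 5.0, size=2)

def log_likelihood(theta):
    return -0.5 * np.sum(theta ** 2) - np.log(2.0 * np.pi)

N, iterations = 100, 800                    # N live points, j iterations
live = np.array([sample_prior() for _ in range(N)])
live_logL = np.array([log_likelihood(t) for t in live])

Z, X_prev = 0.0, 1.0                        # evidence accumulator, prior mass X_0 = 1
samples, weights = [], []                   # weighted posterior samples

for i in range(1, iterations + 1):
    worst = np.argmin(live_logL)            # L_i := least likelihood among live points
    L_i = np.exp(live_logL[worst])
    X_i = np.exp(-i / N)                    # estimated prior mass still enclosed
    w_i = X_prev - X_i                      # w_i := X_{i-1} - X_i
    Z += L_i * w_i                          # Z := Z + L_i * w_i
    samples.append(live[worst].copy())
    weights.append(L_i * w_i)
    X_prev = X_i

    # Replace the worst point by a prior draw with likelihood above L_i
    # (simple rejection sampling; see the MCMC and ellipsoid variants below).
    while True:
        candidate = sample_prior()
        cand_logL = log_likelihood(candidate)
        if cand_logL > live_logL[worst]:
            live[worst], live_logL[worst] = candidate, cand_logL
            break

# The contribution of the remaining live points is ignored here for brevity.
print("evidence estimate Z ~", Z)           # roughly 0.01 for this toy setup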
The original procedure outlined by Skilling (given above in pseudocode) does not specify what specific Markov chain Monte Carlo algorithm should be used to choose new points with better likelihood.
Skilling's own code examples (such as the one in Sivia and Skilling (2006),[6] available on Skilling's website) choose a random existing point and propose a new point a random distance away from it; if the new point's likelihood is better, it is accepted, otherwise it is rejected and the process is repeated.
Mukherjee et al. (2006)[7] found higher acceptance rates by selecting points randomly within an ellipsoid drawn around the existing points; this idea was refined into the MultiNest algorithm[8] which handles multimodal posteriors better by grouping points into likelihood contours and drawing an ellipsoid for each contour.
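As a rough sketch of the ellipsoidal idea (a single bounding ellipsoid rather than MultiNest's multiple, clustered ellipsoids; the enlargement factor and function name are assumptions made for the example), a candidate can be drawn uniformly from an ellipsoid fitted to the live points and then accepted only if its likelihood exceeds the current constraint and it lies within the prior support:

```python
import numpy as np

def sample_within_ellipsoid(live_points, enlarge=1.1, rng=None):
    """Draw one point uniformly from an ellipsoid enclosing the live points."""
    rng = rng or np.random.default_rng()
    mean = live_points.mean(axis=0)
    cov = np.cov(live_points, rowvar=False)

    # Scale the ellipsoid so every live point lies inside it, then enlarge slightly.
    centred = live_points - mean
    inv_cov = np.linalg.inv(cov)
    max_d2 = max(c @ inv_cov @ c for c in centred)
    chol = np.linalg.cholesky(cov * max_d2 * enlarge ** 2)

    # Uniform point in the unit ball: random direction, radius ~ u**(1/dim).
    dim = live_points.shape[1]
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    radius = rng.uniform() ** (1.0 / dim)
    return mean + chol @ (radius * direction)
```

Because the ellipsoid roughly tracks the current likelihood contour, candidates drawn this way satisfy the constraint far more often than those from a blind random walk, which is the higher acceptance rate reported by Mukherjee et al.; MultiNest extends the idea by clustering the live points and fitting an ellipsoid to each cluster.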
Example implementations demonstrating the nested sampling algorithm are publicly available for download, written in several programming languages.
Since nested sampling was proposed in 2004, it has been used in many areas of astronomy.
One paper suggested using nested sampling for cosmological model selection and object detection, as it "uniquely combines accuracy, general applicability and computational feasibility."[7] A refinement of the algorithm to handle multimodal posteriors has been suggested as a means to detect astronomical objects in extant datasets.[10] Other applications of nested sampling are in the field of finite element model updating, where the algorithm is used to choose an optimal finite element model, and this approach has been applied to structural dynamics.[12]
This sampling method has also been used in the field of materials modeling. It can be used to learn the partition function from statistical mechanics and derive thermodynamic properties.[13]
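As a sketch of how that is done (an illustration under the assumption that the run is performed with the Boltzmann factor of the configurational energy playing the role of the likelihood, so that the weights $w_i$ measure configuration-space volume; the function names are invented): the configurational partition function at inverse temperature $\beta$ can be estimated from the discarded energies $E_i$ and weights $w_i$ as $Z(\beta) \approx \sum_i w_i\,e^{-\beta E_i}$, and thermodynamic quantities then follow by differentiating $\ln Z$:

```python
import numpy as np

def log_partition_function(energies, weights, beta):
    """ln Z(beta) ~ ln sum_i w_i exp(-beta * E_i), evaluated in log space."""
    energies, weights = np.asarray(energies), np.asarray(weights)
    return np.logaddexp.reduce(np.log(weights) - beta * energies)

def mean_energy(energies, weights, beta, d_beta=1e-6):
    """<E> = -d ln Z / d beta, estimated by a central finite difference."""
    return -(log_partition_function(energies, weights, beta + d_beta)
             - log_partition_function(energies, weights, beta - d_beta)) / (2 * d_beta)
```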
Dynamic nested sampling is a generalisation of the nested sampling algorithm in which the number of samples taken in different regions of the parameter space is dynamically adjusted to maximise calculation accuracy.[14] This can lead to large improvements in accuracy and computational efficiency when compared to the original nested sampling algorithm, in which the allocation of samples cannot be changed and often many samples are taken in regions which have little effect on calculation accuracy.
Several software packages implementing dynamic nested sampling are publicly available. Dynamic nested sampling has been applied to a variety of scientific problems, including the analysis of gravitational waves,[17] mapping distances in space[18] and exoplanet detection.
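As a rough illustration of the allocation idea when the goal is posterior accuracy (a hedged sketch, not the algorithm of any particular package; the function name, parameters and importance measure are assumptions), the dead points of an initial run can be scored by their posterior weight $L_i w_i$ and extra live points then run only between the likelihood contours that carry most of that weight:

```python
import numpy as np

def extra_sample_range(log_L, log_w, frac=0.9):
    """Choose where to allocate additional live points after an initial run.

    Each dead point of the initial run, with log-likelihood log_L[i] and log
    prior-mass weight log_w[i], is scored by its relative posterior mass
    L_i * w_i; the returned log-likelihood interval contains the points
    carrying a fraction `frac` of the total posterior mass.
    """
    log_L, log_w = np.asarray(log_L), np.asarray(log_w)
    log_post = log_L + log_w
    importance = np.exp(log_post - log_post.max())   # proportional to L_i * w_i
    importance /= importance.sum()

    order = np.argsort(importance)[::-1]             # most important points first
    cumulative = np.cumsum(importance[order])
    keep = order[: np.searchsorted(cumulative, frac) + 1]
    return log_L[keep].min(), log_L[keep].max()
```

In a full implementation this interval would seed additional nested sampling runs whose samples are then merged with the initial run, concentrating computational effort where it most improves the posterior estimate.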