Marginal likelihood

A marginal likelihood is a likelihood function that has been integrated over the parameter space. In Bayesian statistics, it represents the probability of generating the observed sample for all possible values of the parameters; it can be understood as the probability of the model itself and is therefore often referred to as model evidence or simply evidence.

If the focus is not on model comparison, the marginal likelihood is simply the normalizing constant that ensures that the posterior is a proper probability.

It is related to the partition function in statistical mechanics.[1]

Given a set of independent identically distributed data points $\mathbf{X} = (x_1, \ldots, x_n)$, where $x_i \sim p(x \mid \theta)$ according to some probability distribution parameterized by $\theta$, and where $\theta$ itself is a random variable described by a distribution $\theta \sim p(\theta \mid \alpha)$, the marginal likelihood in general asks what the probability $p(\mathbf{X} \mid \alpha)$ is, where $\theta$ has been marginalized out (integrated out):

$$p(\mathbf{X} \mid \alpha) = \int_{\theta} p(\mathbf{X} \mid \theta) \, p(\theta \mid \alpha) \, \mathrm{d}\theta .$$

The above definition is phrased in the context of Bayesian statistics, in which case $p(\theta \mid \alpha)$ is called the prior density and $p(\mathbf{X} \mid \theta)$ is the likelihood.
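As a concrete illustration of this integral (not part of the original article), the sketch below evaluates $p(\mathbf{X} \mid \alpha)$ numerically for a Bernoulli likelihood with a Beta prior on $\theta$, and checks the result against the conjugate closed form; the data and the hyperparameters $a, b$ are arbitrary choices made only for the example.

```python
import numpy as np
from scipy import integrate, stats
from scipy.special import betaln

# Hypothetical coin-flip data and Beta prior hyperparameters (arbitrary illustrative choices).
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])
a, b = 2.0, 2.0          # hyperparameters of the prior p(theta | alpha) = Beta(a, b)
k, n = x.sum(), len(x)   # number of successes and number of observations

def integrand(theta):
    """Likelihood times prior: p(X | theta) * p(theta | alpha)."""
    likelihood = theta ** k * (1 - theta) ** (n - k)
    prior = stats.beta.pdf(theta, a, b)
    return likelihood * prior

# Marginal likelihood p(X | alpha): integrate theta out of the joint density.
numerical, _ = integrate.quad(integrand, 0.0, 1.0)

# Conjugate closed form for the Beta-Bernoulli case: B(a + k, b + n - k) / B(a, b).
exact = np.exp(betaln(a + k, b + n - k) - betaln(a, b))

print(f"numerical: {numerical:.6e}   exact: {exact:.6e}")
```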

The marginal likelihood quantifies the agreement between data and prior in a geometric sense that can be made precise.

In classical (frequentist) statistics, the concept of marginal likelihood occurs instead in the context of a joint parameter $\theta = (\psi, \lambda)$, where $\psi$ is the actual parameter of interest and $\lambda$ is a nuisance parameter. If there exists a probability distribution for $\lambda$, it is often desirable to consider the likelihood function only in terms of $\psi$, by marginalizing out $\lambda$:

$$\mathcal{L}(\psi; \mathbf{X}) = p(\mathbf{X} \mid \psi) = \int_{\lambda} p(\mathbf{X} \mid \psi, \lambda) \, p(\lambda \mid \psi) \, \mathrm{d}\lambda .$$

Unfortunately, marginal likelihoods are generally difficult to compute.

Exact solutions are known for a small class of distributions, particularly when the marginalized-out parameter is the conjugate prior of the distribution of the data.
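In other cases some form of numerical approximation is needed. One common, though often high-variance, approach is simple Monte Carlo: draw parameter values from the prior and average the likelihood. A minimal sketch, reusing the Beta-Bernoulli setup above purely so the estimate can be compared with the exact value:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1, 0, 1, 1, 0, 1, 1, 1])   # same hypothetical data as above
a, b = 2.0, 2.0
k, n = x.sum(), len(x)

# Naive Monte Carlo: p(X | alpha) ~= (1/S) * sum_s p(X | theta_s), with theta_s drawn from the prior.
thetas = rng.beta(a, b, size=100_000)
log_lik = k * np.log(thetas) + (n - k) * np.log1p(-thetas)
estimate = np.exp(log_lik).mean()

print(f"Monte Carlo estimate of p(X | alpha): {estimate:.6e}")
```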

It is also possible to apply the above considerations to a single random variable (data point) $x$, rather than to a set of observations.

In a Bayesian context, this is equivalent to the prior predictive distribution of a data point.
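For instance, under a Bernoulli likelihood with a Beta$(a, b)$ prior (the hypothetical setup used in the sketches above), the prior predictive probability of a single success is $p(x = 1 \mid \alpha) = \int_0^1 \theta \, \mathrm{Beta}(\theta; a, b) \, \mathrm{d}\theta = \frac{a}{a + b}$.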

In Bayesian model comparison, the marginalized variables $\theta$ are parameters for a particular type of model, and the remaining variable $M$ is the identity of the model itself. In this case, the marginalized likelihood is the probability of the data given the model type, not assuming any particular model parameters:

$$p(\mathbf{X} \mid M) = \int p(\mathbf{X} \mid \theta, M) \, p(\theta \mid M) \, \mathrm{d}\theta .$$

It is in this context that the term model evidence is normally used.

This quantity is important because the posterior odds ratio for a model $M_1$ against another model $M_2$ involves a ratio of marginal likelihoods, called the Bayes factor:

$$\frac{p(M_1 \mid \mathbf{X})}{p(M_2 \mid \mathbf{X})} = \frac{p(M_1)}{p(M_2)} \, \frac{p(\mathbf{X} \mid M_1)}{p(\mathbf{X} \mid M_2)},$$

which can be stated schematically as

posterior odds = prior odds × Bayes factor.
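As a concrete (hypothetical) illustration of this ratio, the sketch below computes the Bayes factor for the coin-flip data used earlier under two candidate models: $M_1$, a fixed fair coin with no free parameters, and $M_2$, a Bernoulli model with a uniform Beta(1, 1) prior on the success probability.

```python
import numpy as np
from scipy.special import betaln

x = np.array([1, 0, 1, 1, 0, 1, 1, 1])   # same hypothetical data as above
k, n = x.sum(), len(x)

# M1: a fixed fair coin -- no free parameters, so the evidence is just the likelihood at theta = 1/2.
evidence_m1 = 0.5 ** n

# M2: Bernoulli likelihood with a uniform Beta(1, 1) prior on theta (conjugate closed form).
evidence_m2 = np.exp(betaln(1 + k, 1 + n - k) - betaln(1, 1))

bayes_factor = evidence_m1 / evidence_m2
print(f"Bayes factor p(X | M1) / p(X | M2) = {bayes_factor:.3f}")
# With equal prior model probabilities, the posterior odds equal the Bayes factor.
```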