Marginal distribution

In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset.

This contrasts with a conditional distribution, which gives the probabilities contingent upon the values of the other variables.

These concepts are "marginal" because they can be found by summing values in a table along rows or columns, and writing the sum in the margins of the table.
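The row-and-column summation described above can be sketched in a few lines of Python. The 2×2 joint table here is hypothetical and chosen only for illustration; exact fractions are used so the margins can be read off without rounding.

```python
from fractions import Fraction

# Hypothetical joint probability table p(x, y):
# rows are values of X, columns are values of Y.
joint = [
    [Fraction(1, 8), Fraction(1, 8)],  # X = 0
    [Fraction(1, 4), Fraction(1, 2)],  # X = 1
]

# Row sums are written in the right-hand margin: the marginal distribution of X.
marginal_x = [sum(row) for row in joint]

# Column sums are written in the bottom margin: the marginal distribution of Y.
marginal_y = [sum(col) for col in zip(*joint)]

print(marginal_x)
print(marginal_y)
```

Each margin sums to 1, as any probability distribution must.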

The context here is that the theoretical studies being undertaken, or the data analysis being done, involve a wider set of random variables, but attention is limited to a reduced number of those variables.

In many applications, an analysis may start with a given collection of random variables, then first extend the set by defining new ones (such as the sum of the original random variables) and finally reduce the number by placing interest in the marginal distribution of a subset (such as the sum).
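As a minimal sketch of this extend-then-reduce pattern, assume the original variables are two fair six-sided dice (an illustrative choice, not taken from the text). The set is extended with their sum S, and the other variables are then summed out to leave the marginal distribution of S.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint distribution of (D1, D2, S), where S = D1 + D2 is the newly defined variable.
# Assumption: two independent fair dice, so each (d1, d2) pair has probability 1/36.
joint = {}
for d1, d2 in product(range(1, 7), repeat=2):
    joint[(d1, d2, d1 + d2)] = Fraction(1, 36)

# Reduce to the marginal distribution of S by summing over D1 and D2.
marginal_sum = defaultdict(Fraction)
for (d1, d2, s), p in joint.items():
    marginal_sum[s] += p

for s in sorted(marginal_sum):
    print(s, marginal_sum[s])
```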

Several different analyses may be done, each considering the marginal distribution of a different subset of variables.

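Given a known joint distribution of two discrete random variables X and Y, the marginal distribution of X is the probability distribution of X when the values of Y are not taken into consideration. It can be calculated by summing the joint probability distribution over all values of Y.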

Naturally, the converse is also true: the marginal distribution can be obtained for Y by summing over the separate values of X.
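Writing p(x_i, y_j) for the joint probability P(X = x_i, Y = y_j), the two summations can be sketched as follows. The indices i and j are assumed to range over all possible values of X and Y respectively.

```latex
p_X(x_i) = \sum_{j} p(x_i, y_j),
\qquad
p_Y(y_j) = \sum_{i} p(x_i, y_j)
```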

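A marginal probability can always be written as an expected value:

$$p_X(x) = \int_y p_{X \mid Y}(x \mid y)\, p_Y(y)\, \mathrm{d}y = \operatorname{E}_Y\left[p_{X \mid Y}(x \mid Y)\right]$$

This follows from the definition of expected value (after applying the law of the unconscious statistician):

$$\operatorname{E}_Y[g(Y)] = \int_y g(y)\, p_Y(y)\, \mathrm{d}y$$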
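The expected-value form can be checked on a small discrete example, where the integral becomes a sum. The distribution of Y and the conditional distribution of X given Y below are hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Hypothetical distribution of Y.
p_y = {"a": Fraction(1, 4), "b": Fraction(3, 4)}

# Hypothetical conditional distribution of X given each value of Y.
p_x_given_y = {
    "a": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "b": {0: Fraction(1, 3), 1: Fraction(2, 3)},
}

def marginal_x(x):
    # E_Y[p(x | Y)]: average the conditional probability of x,
    # weighting each value of Y by its own probability.
    return sum(p_x_given_y[y][x] * p_y[y] for y in p_y)

print(marginal_x(0), marginal_x(1))
```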

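Given two continuous random variables X and Y whose joint distribution is known, the marginal probability density function of X can be obtained by integrating the joint probability density, f, over Y, and vice versa. That is, $f_X(x) = \int_c^d f(x, y)\, \mathrm{d}y$ and $f_Y(y) = \int_a^b f(x, y)\, \mathrm{d}x$, where x ∈ [a, b] and y ∈ [c, d].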
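A numerical sketch of this integration, assuming a hypothetical joint density f(x, y) = x + y on the unit square (so the exact marginal of X is x + 1/2). The midpoint rule stands in for the integral; any quadrature method could be used instead.

```python
def joint_density(x, y):
    # Hypothetical joint density f(x, y) = x + y on [0, 1] x [0, 1].
    return x + y

def marginal_density_x(x, steps=1000):
    # Midpoint-rule approximation of the integral of f(x, y) over y in [0, 1].
    width = 1.0 / steps
    return sum(joint_density(x, (k + 0.5) * width) for k in range(steps)) * width

# Compare the numerical marginal with the exact value x + 1/2.
print(marginal_density_x(0.3), 0.3 + 0.5)
```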

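Finding the marginal cumulative distribution function from the joint cumulative distribution function is straightforward. Recall that the joint cumulative distribution function is $F(x, y) = P(X \le x, Y \le y)$.

If X and Y jointly take values on [a, b] × [c, d], then $F_X(x) = F(x, d)$ and $F_Y(y) = F(b, y)$. If d is ∞, then this becomes a limit: $F_X(x) = \lim_{y \to \infty} F(x, y)$, and likewise for $F_Y(y)$.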
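The limiting case can be illustrated numerically. The sketch below assumes a hypothetical joint cumulative distribution function for two independent Exponential(1) variables, and approximates the limit by evaluating the second argument at a large finite value (`large_y` is an illustrative parameter).

```python
import math

def joint_cdf(x, y):
    # Hypothetical joint CDF of two independent Exponential(1) variables.
    if x < 0 or y < 0:
        return 0.0
    return (1 - math.exp(-x)) * (1 - math.exp(-y))

def marginal_cdf_x(x, large_y=50.0):
    # Approximate the limit of F(x, y) as y grows without bound
    # by evaluating at a large finite y.
    return joint_cdf(x, large_y)

# The exact marginal CDF of X in this example is 1 - exp(-x).
print(marginal_cdf_x(1.0), 1 - math.exp(-1.0))
```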

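The conditional distribution of one variable given another is the joint distribution of both variables divided by the marginal distribution of the conditioning variable.[3] That is, for discrete random variables, $p_{Y \mid X}(y \mid x) = P(X = x, Y = y) / P(X = x)$.

Suppose there is data from a classroom of 200 students on the amount of time studied (X) and the percentage of correct answers (Y).[4]

Assuming that X and Y are discrete random variables, the joint distribution of X and Y can be described by listing all the possible values of p(xi,yj), as shown in Table 3.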

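The marginal distribution can be used to determine how many students scored 20 or below by summing the joint probabilities for that score range over every study-time category, $P(Y \le 20) = \sum_i P(X = x_i, Y \le 20)$, and multiplying the result by the 200 students.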

The conditional distribution can be used to determine the probability that a student who studied 60 minutes or more obtains a score of 20 or below: $P(Y \le 20 \mid X \ge 60) = P(X \ge 60, Y \le 20) / P(X \ge 60)$, meaning there is about an 11% probability of scoring 20 or below after having studied for at least 60 minutes.
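The difference between the two calculations can be sketched in Python. The counts below are hypothetical and collapsed into two study-time and two score categories; they are not the article's table, only values chosen to be consistent with a class of 200 students and a conditional probability of about 11%.

```python
from fractions import Fraction

# Hypothetical student counts by (study time, score) category.
counts = {
    ("under_60", "score_le_20"): 2,
    ("under_60", "score_gt_20"): 128,
    ("60_plus", "score_le_20"): 8,
    ("60_plus", "score_gt_20"): 62,
}
total = sum(counts.values())

# Joint distribution: each count divided by the class size.
joint = {key: Fraction(n, total) for key, n in counts.items()}

# Marginal: probability of a score of 20 or below, regardless of study time.
p_low_score = sum(p for (study, score), p in joint.items() if score == "score_le_20")

# Marginal: probability of studying 60 minutes or more, regardless of score.
p_long_study = sum(p for (study, score), p in joint.items() if study == "60_plus")

# Conditional: joint probability divided by the marginal of the conditioning event.
p_low_given_long = joint[("60_plus", "score_le_20")] / p_long_study

print(p_low_score * total, float(p_low_given_long))
```

The marginal answers "how many students overall", while the conditional restricts attention to the students who studied for at least 60 minutes.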

Suppose that the probability that a pedestrian will be hit by a car, while crossing the road at a pedestrian crossing, without paying attention to the traffic light, is to be computed.

Let H be a discrete random variable taking one value from {Hit, Not Hit}. Let L (for traffic light) be a discrete random variable taking one value from {Red, Yellow, Green}.

Realistically, H will be dependent on L. That is, P(H = Hit) will take different values depending on whether L is red, yellow or green (and likewise for P(H = Not Hit)).

A person is, for example, far more likely to be hit by a car when trying to cross while the lights for perpendicular traffic are green than if they are red.

In other words, for any given possible pair of values for H and L, one must consider the joint probability distribution of H and L to find the probability of that pair of events occurring together if the pedestrian ignores the state of the light.

However, in trying to calculate the marginal probability P(H = Hit), what is being sought is the probability that H = Hit in the situation in which the particular value of L is unknown and in which the pedestrian ignores the state of the light.

Here is a table showing the conditional probabilities of being hit, depending on the state of the lights.

To find the joint probability distribution, more data is required.

Multiplying each column in the conditional distribution by the probability of that column occurring results in the joint probability distribution of H and L, given in the central 2×3 block of entries.
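The column-by-column multiplication can be sketched as follows. All numbers here are assumed for illustration: the light probabilities and the conditional probabilities of being hit are hypothetical stand-ins for the values in the tables.

```python
from fractions import Fraction

# Assumed probabilities of each light state (hypothetical values).
p_light = {
    "Red": Fraction(2, 10),
    "Yellow": Fraction(1, 10),
    "Green": Fraction(7, 10),
}

# Assumed conditional probabilities P(H = Hit | L) (hypothetical values).
p_hit_given_light = {
    "Red": Fraction(1, 100),
    "Yellow": Fraction(1, 10),
    "Green": Fraction(8, 10),
}

# Joint distribution: multiply each conditional column by P(L).
joint = {}
for light, p_l in p_light.items():
    joint[("Hit", light)] = p_hit_given_light[light] * p_l
    joint[("Not Hit", light)] = (1 - p_hit_given_light[light]) * p_l

# Marginal probability of being hit: sum the Hit row over all light states.
p_hit = sum(p for (h, light), p in joint.items() if h == "Hit")

print(float(p_hit))
```

Summing the Hit row weights each conditional probability by how often that light state occurs, which is why the marginal is dominated by the most likely state under these assumed values.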

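For multivariate distributions, formulae similar to those above apply with the symbols X and/or Y being interpreted as vectors.[5]

That is, if X1, X2, …, Xn are discrete random variables, then the marginal probability mass function of Xi is

$$p_{X_i}(k) = \sum p(x_1, x_2, \dots, x_{i-1}, k, x_{i+1}, \dots, x_n),$$

where the sum runs over all values of the other n − 1 variables.

If X1, X2, …, Xn are continuous random variables, then the marginal probability density function of Xi is

$$f_{X_i}(x_i) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \dots, x_n)\, \mathrm{d}x_1 \cdots \mathrm{d}x_{i-1}\, \mathrm{d}x_{i+1} \cdots \mathrm{d}x_n.$$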
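For the discrete case, a minimal sketch of marginalizing a joint probability mass function over any subset of coordinates. The helper `marginalize` and the three-coin example are hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def marginalize(joint, keep):
    """Sum a joint pmf (dict mapping outcome tuples to probabilities)
    over every coordinate whose index is not listed in keep."""
    result = defaultdict(Fraction)
    for outcome, p in joint.items():
        result[tuple(outcome[i] for i in keep)] += p
    return dict(result)

# Hypothetical joint pmf of (X1, X2, X3): three independent fair coin flips.
joint = {outcome: Fraction(1, 8) for outcome in product((0, 1), repeat=3)}

# Marginal of X1 alone, and of the pair (X1, X3).
marginal_first = marginalize(joint, keep=[0])
marginal_pair = marginalize(joint, keep=[0, 2])

print(marginal_first)
print(marginal_pair)
```

Keeping more than one index gives the marginal distribution of a vector of variables, matching the vector interpretation mentioned above.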