Negative multinomial distribution

In probability theory and statistics, the negative multinomial distribution is a generalization of the negative binomial distribution (NB(x0, p)) to more than two outcomes.[1]

As with the univariate negative binomial distribution, if the parameter x0 is a positive integer, the negative multinomial distribution has an urn model interpretation.

Suppose we have an experiment that generates m+1≥2 possible outcomes, {X0,...,Xm}, each occurring with non-negative probabilities {p0,...,pm} respectively.

If sampling proceeded until n observations were made, then {X0,...,Xm} would have been multinomially distributed.

However, if the experiment is stopped once X0 reaches the predetermined value x0 (assuming x0 is a positive integer), then the distribution of the m-tuple {X1,...,Xm} is negative multinomial.

These variables are not multinomially distributed because their sum X1+...+Xm is not fixed, being a draw from a negative binomial distribution.
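The stopping rule above can be sketched as a small simulation. This is only an illustrative sketch: the function name `sample_negative_multinomial` and the parameter values are hypothetical, and the comparison at the end assumes the standard mean of the distribution, x0·pi/p0.

```python
import random

def sample_negative_multinomial(x0, probs, rng):
    """Draw one (X1, ..., Xm) by running trials until outcome 0 has occurred x0 times.

    probs = [p0, p1, ..., pm] should sum to 1, with p0 > 0 so the loop terminates.
    """
    counts = [0] * len(probs)
    outcomes = range(len(probs))
    while counts[0] < x0:
        k = rng.choices(outcomes, weights=probs)[0]  # one trial of the experiment
        counts[k] += 1
    return counts[1:]  # X0 is fixed at x0, so only X1..Xm are random

rng = random.Random(0)
probs = [0.5, 0.3, 0.2]  # hypothetical p0, p1, p2
x0 = 4
draws = [sample_negative_multinomial(x0, probs, rng) for _ in range(20000)]
mean = [sum(d[i] for d in draws) / len(draws) for i in range(2)]
print(mean)  # should be close to x0 * p_i / p0, i.e. about [2.4, 1.6]
```

Note that the total number of trials differs from draw to draw, which is why the counts are not multinomial.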

If the m-dimensional vector X is partitioned, and the probability vector p accordingly, as follows

{\displaystyle \mathbf {X} ={\begin{bmatrix}\mathbf {X} ^{(1)}\\\mathbf {X} ^{(2)}\end{bmatrix}}{\text{ with sizes }}{\begin{bmatrix}n\times 1\\(m-n)\times 1\end{bmatrix}}}

{\displaystyle {\boldsymbol {p}}={\begin{bmatrix}{\boldsymbol {p}}^{(1)}\\{\boldsymbol {p}}^{(2)}\end{bmatrix}}{\text{ with sizes }}{\begin{bmatrix}n\times 1\\(m-n)\times 1\end{bmatrix}}}

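then, with q defined as p0 plus the sum of the entries of p⁽¹⁾, the marginal distribution of X⁽¹⁾ is

{\displaystyle \mathbf {X} ^{(1)}\sim \operatorname {NM} \left(x_{0},{\frac {p_{0}}{q}},{\frac {{\boldsymbol {p}}^{(1)}}{q}}\right),\qquad q=1-\sum _{i}p_{i}^{(2)}=p_{0}+\sum _{i}p_{i}^{(1)}.}

That is, the marginal distribution is also negative multinomial with the p⁽²⁾ removed and the remaining p's properly scaled so as to add to one.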
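The rescaling can be illustrated with a short sketch. The helper name `marginal_params` and the numbers are hypothetical; the sketch assumes that the retained probabilities, including p0, are divided by their sum q, and that x0 is left unchanged.

```python
def marginal_params(p0, p, keep):
    """Return (p0', p') for the marginal of the components listed in `keep`.

    p0 is the probability of the stopping outcome, p = [p1, ..., pm],
    and `keep` holds zero-based indices into p.
    """
    q = p0 + sum(p[i] for i in keep)  # total probability of the retained outcomes
    return p0 / q, [p[i] / q for i in keep]

# Keep the first and third components of a hypothetical 3-dimensional vector.
p0_new, p_new = marginal_params(0.5, [0.3, 0.15, 0.05], keep=[0, 2])
print(p0_new, p_new)
```

After rescaling, `p0_new` and the entries of `p_new` sum to one, as a valid parameter set requires.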

The univariate marginal distribution of any single component Xi is negative binomial.

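The conditional distribution of X⁽¹⁾ given X⁽²⁾ = x⁽²⁾ is negative multinomial, with x0 increased by the sum of the entries of x⁽²⁾, with probability vector p⁽¹⁾, and with one minus the sum of the entries of p⁽¹⁾ in place of p0:

{\displaystyle \mathbf {X} ^{(1)}\mid \mathbf {X} ^{(2)}=\mathbf {x} ^{(2)}\sim \operatorname {NM} \left(x_{0}+\sum _{i}x_{i}^{(2)},\;{\boldsymbol {p}}^{(1)}\right).}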
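A numerical spot check can make the conditional result concrete. The sketch below assumes the usual negative multinomial probability mass function, Γ(x0 + Σxi) / (Γ(x0) ∏xi!) · p0^x0 ∏pi^xi, and the standard result that X⁽¹⁾ given X⁽²⁾ = x⁽²⁾ is negative multinomial with x0 replaced by x0 plus the sum of x⁽²⁾ and with probabilities p⁽¹⁾. The helper name `nm_pmf` and the parameter values are hypothetical.

```python
from math import lgamma, log, exp

def nm_pmf(x, x0, p0, p):
    """Negative multinomial pmf at counts x = (x1, ..., xm), computed on the log scale."""
    log_f = lgamma(x0 + sum(x)) - lgamma(x0) + x0 * log(p0)
    for xi, pi in zip(x, p):
        log_f += xi * log(pi) - lgamma(xi + 1)
    return exp(log_f)

x0, p0, p1, p2 = 3, 0.5, 0.3, 0.2  # hypothetical parameters, m = 2
x1, x2 = 2, 4                      # evaluate P(X1 = 2 | X2 = 4)

joint = nm_pmf([x1, x2], x0, p0, [p1, p2])
q = p0 + p2                        # rescaling constant for the marginal of X2
marginal_x2 = nm_pmf([x2], x0, p0 / q, [p2 / q])

conditional = joint / marginal_x2
closed_form = nm_pmf([x1], x0 + x2, 1 - p1, [p1])
print(conditional, closed_form)    # the two values should agree
```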

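If two independent random vectors follow negative multinomial distributions with the same probabilities p and with parameters r1 and r2 in place of x0, then their sum is negative multinomial with parameter r1 + r2 and the same p.

Similarly and conversely, it is easy to see from the characteristic function that the negative multinomial is infinitely divisible.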

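If X = (X1, ..., Xm) is negative multinomial with parameters x0 and (p1, ..., pm), then, if the random variables with subscripts i and j are dropped from the vector and replaced by their sum, the resulting vector is again negative multinomial:

{\displaystyle \mathbf {X} '=(X_{1},\ldots ,X_{i}+X_{j},\ldots ,X_{m})\sim \operatorname {NM} (x_{0},(p_{1},\ldots ,p_{i}+p_{j},\ldots ,p_{m})).}

This aggregation property may be used to derive the marginal distribution of Xi mentioned above.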
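The aggregation property can be checked numerically by summing the joint probabilities over every way of splitting a total between two components. The sketch assumes the usual negative multinomial probability mass function; the helper name `nm_pmf` and all parameter values are hypothetical.

```python
from math import lgamma, log, exp

def nm_pmf(x, x0, p0, p):
    """Negative multinomial pmf at counts x = (x1, ..., xm), computed on the log scale."""
    log_f = lgamma(x0 + sum(x)) - lgamma(x0) + x0 * log(p0)
    for xi, pi in zip(x, p):
        log_f += xi * log(pi) - lgamma(xi + 1)
    return exp(log_f)

x0, p0, p = 2.5, 0.4, [0.3, 0.2, 0.1]  # hypothetical; x0 need not be an integer here
s, t = 3, 1                            # target values for X1 + X2 and X3

# P(X1 + X2 = s, X3 = t): sum the joint pmf over all splits of s.
summed = sum(nm_pmf([a, s - a, t], x0, p0, p) for a in range(s + 1))

# The same probability from the aggregated distribution with p1 + p2 merged.
aggregated = nm_pmf([s, t], x0, p0, [p[0] + p[1], p[2]])
print(summed, aggregated)              # the two values should agree
```

The agreement follows from the binomial theorem applied to the terms in p1 and p2.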

The diagonal entries of the correlation matrix are 1, and the off-diagonal entries (i ≠ j) are

{\displaystyle \rho (X_{i},X_{j})={\frac {\operatorname {cov} (X_{i},X_{j})}{\sqrt {\operatorname {var} (X_{i})\operatorname {var} (X_{j})}}}={\sqrt {\frac {p_{i}p_{j}}{(p_{0}+p_{i})(p_{0}+p_{j})}}}.}
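As a minimal sketch, the correlation matrix can be built directly from this formula with NumPy. The function name `nm_correlation` and the example values are hypothetical.

```python
import numpy as np

def nm_correlation(p0, p):
    """Correlation matrix of (X1, ..., Xm): the off-diagonal formula, with ones on the diagonal."""
    p = np.asarray(p, dtype=float)
    ratio = p / (p0 + p)                    # p_i / (p0 + p_i)
    corr = np.sqrt(np.outer(ratio, ratio))  # sqrt(p_i p_j / ((p0 + p_i)(p0 + p_j)))
    np.fill_diagonal(corr, 1.0)
    return corr

corr = nm_correlation(0.5, [0.3, 0.2])
print(corr)
```

The parameter x0 does not enter the formula, and every off-diagonal entry is positive.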

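If we let the mean vector of the negative multinomial be

{\displaystyle {\boldsymbol {\mu }}={\frac {x_{0}}{p_{0}}}{\boldsymbol {p}}}

and covariance matrix

{\displaystyle {\boldsymbol {\Sigma }}={\frac {x_{0}}{p_{0}^{2}}}\,{\boldsymbol {p}}{\boldsymbol {p}}'+{\frac {x_{0}}{p_{0}}}\,\operatorname {diag} ({\boldsymbol {p}}),}

then it can be shown through properties of determinants that

{\displaystyle |{\boldsymbol {\Sigma }}|={\frac {1}{p_{0}}}\prod _{i=1}^{m}\mu _{i}.}

Combining this with the sum of the means, which equals x0(1 − p0)/p0, gives

{\displaystyle x_{0}={\frac {\sum _{i=1}^{m}\mu _{i}\prod _{i=1}^{m}\mu _{i}}{|{\boldsymbol {\Sigma }}|-\prod _{i=1}^{m}\mu _{i}}}\qquad {\text{and}}\qquad {\boldsymbol {p}}={\frac {|{\boldsymbol {\Sigma }}|-\prod _{i=1}^{m}\mu _{i}}{|{\boldsymbol {\Sigma }}|\sum _{i=1}^{m}\mu _{i}}}\,{\boldsymbol {\mu }}.}

Substituting the sample mean vector and sample covariance matrix S for μ and Σ yields the method of moments estimates

{\displaystyle {\hat {x}}_{0}={\frac {\sum _{i=1}^{m}{\bar {x}}_{i}\prod _{i=1}^{m}{\bar {x}}_{i}}{|\mathbf {S} |-\prod _{i=1}^{m}{\bar {x}}_{i}}}\qquad {\text{and}}\qquad {\hat {\boldsymbol {p}}}={\frac {|\mathbf {S} |-\prod _{i=1}^{m}{\bar {x}}_{i}}{|\mathbf {S} |\sum _{i=1}^{m}{\bar {x}}_{i}}}\,{\bar {\mathbf {x} }}.}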
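The moment-matching step can be sketched in NumPy. The sketch assumes the standard moments μ = (x0/p0) p and Σ = (x0/p0²) p p′ + (x0/p0) diag(p), from which |Σ| = ∏μi / p0. The function name `nm_method_of_moments` and the parameter values are hypothetical, and the check uses exact population moments rather than data.

```python
import numpy as np

def nm_method_of_moments(mean, cov):
    """Solve the moment equations for (x0, p) given a mean vector and covariance matrix."""
    mean = np.asarray(mean, dtype=float)
    det_cov = np.linalg.det(np.asarray(cov, dtype=float))
    prod_mean = mean.prod()
    total = mean.sum()
    x0_hat = total * prod_mean / (det_cov - prod_mean)
    p_hat = (det_cov - prod_mean) / (det_cov * total) * mean
    return x0_hat, p_hat

# Check on exact population moments for hypothetical parameters.
x0, p0, p = 4.0, 0.5, np.array([0.3, 0.2])
mu = x0 / p0 * p
sigma = x0 / p0**2 * np.outer(p, p) + x0 / p0 * np.diag(p)

x0_hat, p_hat = nm_method_of_moments(mu, sigma)
print(x0_hat, p_hat)  # recovers x0 and p up to rounding error
```

With data in an array `samples` of shape (N, m), one would pass `samples.mean(axis=0)` and `np.cov(samples, rowvar=False)` instead. The estimates can become unstable, or even negative, when the determinant of the sample covariance is close to or below the product of the sample means.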

Waller LA and Zelterman D. (1997). Log-linear modeling with the negative multinomial distribution.

Johnson, Norman L.; Kotz, Samuel; Balakrishnan, N. (1997). "Chapter 36: Negative Multinomial and Other Multinomial-Related Distributions". Discrete Multivariate Distributions.