For example, we could use the negative binomial distribution to model the number of days n (random) a certain machine works (specified by r) before it breaks down.
A convention among engineers, climatologists, and others is to use "negative binomial" or "Pascal" for the case of an integer-valued stopping-time parameter (r).
Here, the quantity in parentheses is the binomial coefficient, and is equal to $\binom{k+r-1}{k} = \frac{(k+r-1)!}{(r-1)!\,k!} = \frac{\Gamma(k+r)}{k!\,\Gamma(r)}$. Note that Γ(r) is the gamma function.
The above binomial coefficient, due to its combinatorial interpretation, gives precisely the number of all these sequences of length k + r − 1.
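The gamma-function form of the coefficient can be checked numerically. The following is a minimal sketch, assuming the convention in which k counts failures before the r-th success and p is the success probability; the function names `nb_coefficient` and `nb_pmf` are illustrative and not taken from any library.

```python
import math

def nb_coefficient(k: int, r: float) -> float:
    """Binomial coefficient C(k + r - 1, k), computed as Gamma(k + r) / (k! * Gamma(r))."""
    # Working with log-gamma avoids overflow and allows any real r > 0.
    return math.exp(math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r))

def nb_pmf(k: int, r: float, p: float) -> float:
    """P(X = k) for k failures before the r-th success (assumed convention)."""
    return nb_coefficient(k, r) * (1.0 - p) ** k * p ** r

# For integer r the gamma form should agree with the combinatorial count.
k, r = 4, 3
print(nb_coefficient(k, r), math.comb(k + r - 1, k))
```

For integer r the two printed values should coincide up to floating-point rounding, while `nb_coefficient` also accepts non-integer r.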
[9][10][11] Each of the four definitions of the negative binomial distribution can be expressed in slightly different but equivalent ways.
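Sometimes the distribution is parameterized in terms of its mean μ and variance σ²: with X counting the failures before the r-th success and p the success probability, $p = \mu/\sigma^2$ and $r = \mu^2/(\sigma^2 - \mu)$, which requires σ² > μ. Another popular parameterization uses r and the failure odds β: $p = 1/(1+\beta)$, so that the mean is rβ and the variance is rβ(1 + β).
Hospital length of stay is an example of real-world data that can be modelled well with a negative binomial distribution via negative binomial regression.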
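The mean-variance and failure-odds parameterizations mentioned above can be converted back to (r, p) mechanically. The sketch below assumes the convention in which the variable counts failures before the r-th success, so that the mean is r(1 − p)/p and the variance is r(1 − p)/p²; the helper names are hypothetical.

```python
def from_mean_variance(mu: float, sigma2: float) -> tuple[float, float]:
    """Return (r, p) for a given mean and variance; requires sigma2 > mu > 0."""
    if not sigma2 > mu > 0:
        raise ValueError("the negative binomial needs variance > mean > 0")
    p = mu / sigma2
    r = mu * mu / (sigma2 - mu)
    return r, p

def from_failure_odds(r: float, beta: float) -> tuple[float, float]:
    """Return (r, p) given r and the failure odds beta = (1 - p) / p."""
    return r, 1.0 / (1.0 + beta)

def mean_variance(r: float, p: float) -> tuple[float, float]:
    """Mean and variance of the number of failures before the r-th success."""
    mean = r * (1.0 - p) / p
    return mean, mean / p

r, p = from_mean_variance(6.0, 15.0)
print(r, p, mean_variance(r, p))
```

Converting to (r, p) and back through `mean_variance` should return the original mean and variance, which gives a quick consistency check on the formulas.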
[17][18] Pat Collis is required to sell candy bars to raise money for the 6th grade field trip.
Write down the number of trials performed in each experiment: a, b, c, ... and set a + b + c + ... = N. Now we would expect about Np successes in total.
A rigorous derivation can be done by representing the negative binomial distribution as the sum of waiting times.
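The heuristic can be illustrated by simulation. If each experiment stops at the r-th success, then n experiments contain nr successes in total, and equating this with Np suggests an average of N/n ≈ r/p trials per experiment. The sketch below is only an illustration of that argument; the function names, the seed, and the number of experiments are arbitrary choices.

```python
import random

def trials_until_r_successes(r: int, p: float, rng: random.Random) -> int:
    """Run Bernoulli(p) trials until the r-th success; return the number of trials."""
    trials = 0
    successes = 0
    while successes < r:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials

def average_trials(r: int, p: float, experiments: int = 20_000, seed: int = 0) -> float:
    """Average number of trials per experiment over many repeated experiments."""
    rng = random.Random(seed)
    total = sum(trials_until_r_successes(r, p, rng) for _ in range(experiments))
    return total / experiments

# The simulated average should be close to r / p.
print(average_trials(5, 0.3), 5 / 0.3)
```

Each experiment is a sum of r independent geometric waiting times with mean 1/p, which is the representation used in the rigorous derivation.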
When counting the number of successes before the r-th failure, as in alternative formulation (3) above, the variance is rp/(1 − p)².
In this case, the binomial coefficient is defined when n is a real number, instead of just a positive integer.
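Recall from above that the sum of independent negative-binomially distributed random variables with stopping parameters r₁ and r₂ and the same value of p is again negative-binomially distributed, with the same p and stopping parameter r₁ + r₂. This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.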
Consider a sequence of negative binomial random variables where the stopping parameter r goes to infinity, while the probability p of success in each trial goes to one, in such a way as to keep the mean of the distribution (i.e. the expected number of failures) constant.
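In this limit the distribution approaches a Poisson distribution with the same mean, which can be seen numerically. The sketch below assumes the failures-before-the-r-th-success convention and writes the fixed mean as λ, so that p = r/(r + λ); the names `nb_pmf`, `poisson_pmf` and `max_gap` are illustrative.

```python
import math

def nb_pmf(k: int, r: float, p: float) -> float:
    """P(k failures before the r-th success), evaluated in log space for stability."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + k * math.log(1.0 - p) + r * math.log(p))
    return math.exp(log_pmf)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def max_gap(lam: float, r: float, kmax: int = 60) -> float:
    """Largest pointwise difference between the two mass functions for k <= kmax."""
    p = r / (r + lam)  # keeps the mean r(1 - p)/p equal to lam as r grows
    return max(abs(nb_pmf(k, r, p) - poisson_pmf(k, lam)) for k in range(kmax + 1))

# The gap should shrink as r increases and p approaches one.
for r in (1, 10, 100, 1000, 10000):
    print(r, max_gap(3.0, r))
```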
The success count follows a Poisson distribution with mean pT, where T is the waiting time for r occurrences in a Poisson process of intensity 1 − p; that is, T is gamma-distributed with shape parameter r and intensity 1 − p. Thus, the negative binomial distribution is equivalent to a Poisson distribution with mean pT, where the random variate T is gamma-distributed with shape parameter r and intensity 1 − p.
The preceding paragraph follows, because λ = pT is gamma-distributed with shape parameter r and intensity (1 − p)/p.
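The gamma-Poisson mixture can be verified by numerical integration. The sketch below assumes the convention of this passage (k successes before the r-th failure, success probability p) and integrates the Poisson mass function against a gamma density with shape r and intensity (1 − p)/p using a simple midpoint rule; the function names, the integration limit and the step count are illustrative choices.

```python
import math

def nb_pmf_successes(k: int, r: float, p: float) -> float:
    """P(k successes before the r-th failure), with success probability p."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + k * math.log(p) + r * math.log(1.0 - p))
    return math.exp(log_pmf)

def mixture_pmf(k: int, r: float, p: float,
                upper: float = 60.0, steps: int = 60_000) -> float:
    """Midpoint-rule integral of Poisson(k; lam) * Gamma(lam; shape r, rate (1-p)/p)."""
    rate = (1.0 - p) / p
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = k * math.log(lam) - lam - math.lgamma(k + 1)
        log_gamma = (r * math.log(rate) + (r - 1.0) * math.log(lam)
                     - rate * lam - math.lgamma(r))
        total += math.exp(log_poisson + log_gamma)
    return total * h

# The mixture and the closed form should agree to several decimal places.
for k in range(4):
    print(k, mixture_pmf(k, 3.5, 0.4), nb_pmf_successes(k, 3.5, 0.4))
```

The fixed upper limit is adequate here because the integrand decays exponentially; other parameter values may need a larger range or a finer grid.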
Suppose p is unknown and an experiment is conducted where it is decided ahead of time that sampling will continue until r successes are found.
In such cases, the observations are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance.
[25][26] Negative binomial modeling is widely employed in ecology and biodiversity research for analyzing count data where overdispersion is very common.
Ignoring overdispersion can lead to significantly inflated model parameters, resulting in misleading statistical inferences.
The negative binomial distribution effectively addresses overdispersed counts by permitting the variance to vary quadratically with the mean.
An additional dispersion parameter governs the slope of the quadratic term, determining the severity of overdispersion.
The model's quadratic mean-variance relationship proves to be a realistic approach for handling overdispersion, as supported by empirical evidence from many studies.
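The quadratic mean-variance relationship can be made concrete with a small sketch. It assumes the common form Var = μ + αμ², where α is the dispersion parameter and α = 0 recovers the Poisson case, together with the mapping r = 1/α and p = r/(r + μ) to the failures-before-the-r-th-success convention. The symbol α and the function names are notation chosen for this illustration.

```python
def nb_variance(mu: float, alpha: float) -> float:
    """Variance under the quadratic law Var = mu + alpha * mu**2."""
    return mu + alpha * mu * mu

def to_r_p(mu: float, alpha: float) -> tuple[float, float]:
    """Map (mean, dispersion) to (r, p), with r = 1/alpha and p = r / (r + mu)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive; alpha = 0 is the Poisson limit")
    r = 1.0 / alpha
    return r, r / (r + mu)

# Variance-to-mean ratio for a Poisson model (alpha = 0) and an overdispersed one.
for mu in (1.0, 5.0, 25.0):
    print(mu, nb_variance(mu, 0.0) / mu, nb_variance(mu, 0.5) / mu)
```

Under this form the variance-to-mean ratio is 1 + αμ, so overdispersion relative to the Poisson model grows with the mean, and a larger α corresponds to stronger aggregation.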
Overall, the NB model offers two attractive features: (1) the convenient interpretation of the dispersion parameter as an index of clustering or aggregation, and (2) its tractable form, featuring a closed expression for the probability mass function.
[32] The negative binomial distribution has been the most effective statistical model for a broad range of multiplicity observations in particle collision experiments [33][34][35][36][37] (see [38] for an overview). It is argued to be a scale-invariant property of matter,[39][40] and it provides the best fit for astronomical observations, where it predicts the number of galaxies in a region of space.
[41][42][43][44] The phenomenological justification for the effectiveness of the negative binomial distribution in these contexts remained unknown for fifty years after it was first observed in 1973.
[45] In 2023, Scott V. Tezlaf demonstrated a proof from first principles, showing that the negative binomial distribution emerges from symmetries in the dynamical equations of a canonical ensemble of particles in Minkowski space.
In outline, the relations among the parameters of the negative binomial distribution are isomorphic to a set of equations for the parameters of a relativistic current density of a canonical ensemble of massive particles, such that one can establish a bijective map between the two. A rigorous alternative proof of this correspondence has also been demonstrated through quantum mechanics via the Feynman path integral.
[46] This distribution was first studied in 1713 by Pierre Remond de Montmort in his Essay d'analyse sur les jeux de hazard, as the distribution of the number of trials required in an experiment to obtain a given number of successes.