Yule–Simon distribution

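Simon originally called it the Yule distribution.[1]

The probability mass function (pmf) of the Yule–Simon(ρ) distribution is

$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1)$$

for integer $k \ge 1$ and real $\rho > 0$, where $\mathrm{B}$ is the beta function.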

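Equivalently, the pmf can be written in terms of the rising factorial as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{k^{\overline{\rho+1}}},$$

where $\Gamma$ is the gamma function and $k^{\overline{\rho+1}} = \Gamma(k+\rho+1)/\Gamma(k)$ denotes the rising factorial.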
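As a minimal sketch of how the pmf $\rho\,\mathrm{B}(k,\rho+1)$ might be evaluated numerically, the beta function can be computed through log-gamma values to avoid overflow for large k. The function name `yule_simon_pmf` is chosen here for illustration and is not taken from any library.

```python
from math import exp, lgamma, log


def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    # log B(k, rho + 1) = lgamma(k) + lgamma(rho + 1) - lgamma(k + rho + 1)
    log_beta = lgamma(k) + lgamma(rho + 1.0) - lgamma(k + rho + 1.0)
    return exp(log(rho) + log_beta)


# For rho = 1 the pmf reduces to 1 / (k * (k + 1)).
print([yule_simon_pmf(k, 1.0) for k in range(1, 5)])
```

Working in log space is a design choice made for this sketch; for small k a direct product of gamma values would work equally well.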

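The probability mass function f has the property that for sufficiently large k we have

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the k-th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.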
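The power-law tail can be checked numerically. The sketch below compares the exact pmf $\rho\,\mathrm{B}(k,\rho+1)$ with the approximation $\rho\,\Gamma(\rho+1)/k^{\rho+1}$; the ratio should approach 1 as k grows. The helper names `log_pmf` and `tail_ratio` are illustrative only.

```python
from math import exp, lgamma, log


def log_pmf(k, rho):
    """log of rho * B(k, rho + 1), computed via log-gamma."""
    return log(rho) + lgamma(k) + lgamma(rho + 1.0) - lgamma(k + rho + 1.0)


def tail_ratio(k, rho):
    """Ratio of the exact pmf to the power law rho * Gamma(rho + 1) / k**(rho + 1)."""
    log_power_law = log(rho) + lgamma(rho + 1.0) - (rho + 1.0) * log(k)
    return exp(log_pmf(k, rho) - log_power_law)


for k in (10, 100, 1000, 10000):
    print(k, tail_ratio(k, 1.5))
```

For ρ = 1 the ratio is exactly k/(k + 1), which gives a quick sanity check on the code.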

The Yule–Simon distribution arose originally as the limiting distribution of a particular model studied by Udny Yule in 1925 to analyze the growth in the number of species per genus in some higher taxa of biotic organisms.

Yule proved that when time goes to infinity, the limit distribution of the number of species in a genus selected uniformly at random has a specific form and exhibits a power-law behavior in its tail.

Thirty years later, the Nobel laureate Herbert A. Simon proposed a discrete-time preferential attachment model to describe the appearance of new words in a large piece of text.

Interestingly enough, the limit distribution of the number of occurrences of each word, when the number of words diverges, coincides with that of the number of species belonging to the randomly chosen genus in the Yule model, for a specific choice of the parameters.

In the context of random graphs, the Barabási–Albert model also exhibits an asymptotic degree distribution that equals the Yule–Simon distribution for a specific choice of the parameters, and it still shows power-law behavior for more general choices of the parameters.

The same holds for other preferential attachment random graph models.

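The Yule–Simon distribution can also be obtained as a compound distribution. Let $W$ be exponentially distributed with rate $\rho$, with density $h(w;\rho) = \rho e^{-\rho w}$, and, given $W = w$, let $K$ be geometrically distributed with success probability $e^{-w}$, where the geometric pmf is $g(k;p) = p(1-p)^{k-1}$ for $k \ge 1$. The Yule–Simon pmf is then the following exponential-geometric compound distribution:

$$f(k;\rho) = \int_0^{\infty} g(k; e^{-w})\, h(w;\rho)\, dw.$$

The maximum likelihood estimator for the parameter $\rho$ given the observations $k_1, \dots, k_N$ is the solution to the fixed point equation

$$\rho^{(t+1)} = \frac{N + a - 1}{b + \sum_{i=1}^{N}\sum_{j=1}^{k_i} \frac{1}{\rho^{(t)} + j}},$$

where $b = 0$ and $a = 1$ are the rate and shape parameters of the gamma distribution prior on $\rho$.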

This algorithm is derived by Garcia[2] by directly optimizing the likelihood.
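The sketch below illustrates both ideas under stated assumptions: it draws samples through the exponential-geometric compound (W exponential with rate ρ, then K geometric with success probability exp(−W)) and then iterates the fixed point update ρ ← (N + a − 1) / (b + Σ_i Σ_{j ≤ k_i} 1/(ρ + j)) until it stabilizes. The function names, starting value, tolerance, and cap on W are choices made for this example, not part of the published algorithm.

```python
import math
import random
from collections import Counter


def sample_yule_simon(rho, rng):
    """Draw K via the compound form: W ~ Exp(rho), then K | W ~ Geometric(exp(-W))."""
    w = min(rng.expovariate(rho), 700.0)  # cap W so exp(-w) does not underflow to 0
    p = math.exp(-w)
    if p >= 1.0:
        return 1
    u = 1.0 - rng.random()  # uniform on (0, 1]
    # Inverse-CDF draw of a geometric variable supported on 1, 2, ...
    return max(1, math.ceil(math.log(u) / math.log1p(-p)))


def double_sum(counts, rho):
    """sum_i sum_{j=1}^{k_i} 1 / (rho + j), grouped by distinct observed k."""
    total, partial, j = 0.0, 0.0, 0
    for k in sorted(counts):
        while j < k:
            j += 1
            partial += 1.0 / (rho + j)
        total += counts[k] * partial
    return total


def fit_rho(ks, a=1.0, b=0.0, rho0=1.0, tol=1e-10, max_iter=10_000):
    """Iterate the fixed point update; a = 1, b = 0 gives the maximum likelihood case."""
    counts = Counter(ks)
    n = len(ks)
    rho = rho0
    for _ in range(max_iter):
        new_rho = (n + a - 1.0) / (b + double_sum(counts, rho))
        if abs(new_rho - rho) <= tol * max(1.0, rho):
            return new_rho
        rho = new_rho
    return rho


rng = random.Random(0)
data = [sample_yule_simon(2.0, rng) for _ in range(20_000)]
print(fit_rho(data))  # should land near the true value 2.0
```

Note that the iteration does not settle when every observation equals 1, since the likelihood then has no finite maximizer; the example therefore uses simulated data with a spread of values.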

Additionally, an expectation–maximization (EM) formulation has been used to give two alternative derivations of the standard error of the estimator from the fixed point equation.

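Writing $k_1, \dots, k_N$ for the observations and $\hat\rho$ for the estimate, the asymptotic variance of the estimator, per observation, is estimated by the inverse of the average observed information,

$$\hat{V} = \left( \frac{1}{\hat\rho^{2}} - \frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{k_i}\frac{1}{(\hat\rho+j)^{2}} \right)^{-1};$$

the standard error is the square root of this quantity divided by N, that is, $\sqrt{\hat{V}/N}$.

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function.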

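The probability mass function of the generalized Yule–Simon(ρ, α) distribution is defined as

$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$$

with $0 \le \alpha < 1$, where $\mathrm{B}_{x}(a,b) = \int_0^{x} t^{a-1}(1-t)^{b-1}\,dt$ is the incomplete beta function.

For $\alpha = 0$ the ordinary Yule–Simon(ρ) distribution is obtained as a special case.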

The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
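A rough numerical sketch of the generalized pmf, $\rho/(1-\alpha^{\rho})$ times the incomplete beta function $\mathrm{B}_{1-\alpha}(k,\rho+1)$, is given below. It approximates the incomplete beta integral with composite Simpson's rule to stay within the standard library; this quadrature, the step count, and the function names are assumptions made for illustration, and a dedicated special-function routine would normally be preferable.

```python
def incomplete_beta(x, a, b, n=2000):
    """Approximate B_x(a, b) = integral_0^x t**(a-1) * (1-t)**(b-1) dt (Simpson, n even)."""
    if x == 0.0:
        return 0.0
    h = x / n

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        # Simpson weights alternate 4, 2, 4, ... over the interior nodes
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k, rho, alpha):
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1-alpha}(k, rho + 1)."""
    if k < 1 or rho <= 0 or not 0.0 <= alpha < 1.0:
        raise ValueError("need k >= 1, rho > 0 and 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)


# With alpha > 0 successive terms shrink roughly geometrically (the cutoff).
for k in (1, 5, 20, 50):
    print(k, generalized_yule_simon_pmf(k, 2.0, 0.0), generalized_yule_simon_pmf(k, 2.0, 0.3))
```

Printing the α = 0 and α = 0.3 columns side by side shows the ordinary power-law decay next to the much faster decay caused by the cutoff.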

Plot of the Yule–Simon(1) distribution (red) and its asymptotic Zipf's law (blue)