Chinese restaurant process

The results of this process are exchangeable, meaning the order in which the customers sit does not affect the probability of the final distribution.

This property greatly simplifies a number of problems in population genetics, linguistic analysis, and image recognition.

The restaurant analogy first appeared in a 1985 write-up by David Aldous,[1] where it was attributed to Jim Pitman (who additionally credits Lester Dubins).

In comparison with Hoppe's urn model, the Chinese restaurant process has the advantage that it naturally lends itself to describing random permutations via their cycle structure, in addition to describing random partitions.

At each time $n$, the state of the process is a partition $B_n$ of the set $\{1, 2, \dots, n\}$, so the Chinese restaurant process takes values in the infinite Cartesian product $\prod_{n \ge 1} \mathcal{P}_n$, where $\mathcal{P}_n$ denotes the set of all partitions of $\{1, 2, \dots, n\}$.

The probability assigned to any particular partition (ignoring the order in which customers sit around any particular table) is
$$\Pr(B_n = B) = \frac{\Gamma(\theta)}{\Gamma(\theta+n)}\,\theta^{|B|}\prod_{b\in B}\Gamma(|b|), \qquad B \in \mathcal{P}_n,$$
where $\Gamma$ is the gamma function, $|B|$ is the number of blocks (tables) in the partition $B$, and $|b|$ is the size of block $b$.

The one-parameter generalization introduces a strength parameter $\theta > 0$: when $n$ customers are already seated, it modifies the probability of customer $n+1$ sitting at an unoccupied table to $\frac{\theta}{n+\theta}$ and correspondingly modifies the probability of them sitting at a table of size $|b|$ to $\frac{|b|}{n+\theta}$.

The parameter $\theta$ can be interpreted as the effective number of customers sitting at the first empty table.

An equivalent, but subtly different, way to define the Chinese restaurant process is to let new customers choose companions rather than tables: customer $n+1$ chooses to sit at the same table as any one of the $n$ seated customers with probability $\frac{1}{n+\theta}$, or to sit alone at a new table with probability $\frac{\theta}{n+\theta}$.
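
The seating dynamics are easy to simulate. The following sampler is a minimal sketch (function and variable names are ours); it implements the general two-parameter seating rule described below, with the one-parameter process recovered at `alpha = 0`:

```python
import random

def crp_sample(n, theta, alpha=0.0, seed=None):
    """Sample table sizes for n customers from a CRP(theta, alpha).

    Seating rule: with m customers seated at k tables, the next customer
    joins an occupied table b with probability (|b| - alpha)/(m + theta)
    and opens a new table with probability (theta + k*alpha)/(m + theta).
    """
    rng = random.Random(seed)
    tables = []  # tables[i] = number of customers at table i
    for m in range(n):
        r = rng.random() * (m + theta)
        for i, size in enumerate(tables):
            r -= size - alpha
            if r < 0:
                tables[i] += 1  # join an occupied table
                break
        else:
            tables.append(1)  # open a new table
    return tables

sizes = crp_sample(1000, theta=2.0, alpha=0.5, seed=0)
print(len(sizes), sum(sizes))  # number of tables; total customers == 1000
```

With a positive discount the table-size distribution is heavier-tailed than in the one-parameter case, which is visible even in small simulations.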

The Chinese restaurant table distribution (CRT) is the probability distribution on the number of tables in the Chinese restaurant process.

For $n$ customers and strength $\theta$, the number of occupied tables $K$ is distributed as a sum of $n$ independent Bernoulli random variables, each with a different parameter:
$$K = \sum_{i=1}^{n} b_i, \qquad b_i \sim \operatorname{Bernoulli}\!\left(\frac{\theta}{\theta+i-1}\right).$$
The probability mass function of $K$ is
$$\Pr(K = k) = \frac{\Gamma(\theta)}{\Gamma(\theta+n)}\,|s(n,k)|\,\theta^{k},$$
where $s(n,k)$ denotes the Stirling numbers of the first kind.
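
The Bernoulli decomposition gives a one-line sampler for the table count (a sketch; the function name is ours). Customer $i$, counting from zero, opens a new table with probability $\theta/(\theta+i)$, independently of all the others:

```python
import random

def sample_num_tables(n, theta, seed=None):
    """Number of occupied tables after n customers in a CRP(theta),
    sampled as a sum of independent Bernoulli variables: customer i
    (i = 0, ..., n-1) opens a new table with probability theta/(theta+i)."""
    rng = random.Random(seed)
    return sum(rng.random() < theta / (theta + i) for i in range(n))

k = sample_num_tables(500, theta=2.0, seed=1)
print(k)  # typically on the order of theta * log(n) tables
```

The expected count $\sum_{i} \theta/(\theta+i)$ grows like $\theta \log n$, which is why the number of tables grows only logarithmically in the number of customers.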

The Chinese restaurant process can be generalized further to a model with two parameters, $\theta$ and $\alpha$,[2][7] commonly called the strength (or concentration) and discount parameters respectively. When $n$ customers are seated, a new customer sits at an occupied table $b$ with probability $\frac{|b|-\alpha}{n+\theta}$, and at a new table with probability $\frac{\theta + |B|\alpha}{n+\theta}$, where $|B|$ is the current number of occupied tables.

The parameters must satisfy either $\alpha < 0$ and $\theta = -L\alpha$ for some $L \in \{1, 2, \dots\}$, or $0 \le \alpha < 1$ and $\theta > -\alpha$. In the former case, $L$ is an upper bound on the number of blocks in the partition; see the subsection on the Dirichlet-categorical model below for more details.

For $0 < \alpha < 1$ and $\theta > -\alpha$, the partition probability can be rewritten in terms of the Gamma function as
$$\Pr(B_n = B) = \frac{\Gamma(\theta)}{\Gamma(\theta+n)} \cdot \frac{\alpha^{|B|}\,\Gamma(\theta/\alpha + |B|)}{\Gamma(\theta/\alpha)} \prod_{b\in B}\frac{\Gamma(|b|-\alpha)}{\Gamma(1-\alpha)}.$$
In the one-parameter case, where $\alpha = 0$, this simplifies to
$$\Pr(B_n = B) = \frac{\Gamma(\theta)\,\theta^{|B|}}{\Gamma(\theta+n)} \prod_{b\in B}\Gamma(|b|).$$
This is the distribution underlying the Ewens sampling formula, used in population genetics and the unified neutral theory of biodiversity.
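
In log space these Gamma-function forms are straightforward to evaluate stably. The following is a sketch (the function name is ours), assuming $\theta > 0$:

```python
from math import lgamma, log, exp

def crp_log_prob(block_sizes, theta, alpha=0.0):
    """Log probability of a partition (given by its block sizes) under a
    CRP(theta, alpha), evaluated with lgamma to avoid overflow for large n."""
    n = sum(block_sizes)
    k = len(block_sizes)
    lp = lgamma(theta) - lgamma(theta + n)
    if alpha == 0.0:
        # one-parameter case: theta^k * prod_b Gamma(|b|)
        lp += k * log(theta) + sum(lgamma(b) for b in block_sizes)
    else:
        lp += (k * log(alpha) + lgamma(theta / alpha + k) - lgamma(theta / alpha)
               + sum(lgamma(b - alpha) - lgamma(1 - alpha) for b in block_sizes))
    return lp

# Sanity check: with one customer, the only partition {{1}} has probability 1.
print(exp(crp_log_prob([1], theta=2.0)))  # → 1.0
```

As a further check, the two partitions of $\{1, 2\}$ (one table of two, or two singletons) receive probabilities summing to one for any valid parameter choice.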

Notice that for this parameter setting ($\alpha < 0$ and $\theta = -L\alpha$), the probability of occupying a new table, when there are already $L$ occupied tables, is $\frac{\theta + L\alpha}{n+\theta} = 0$, so the number of occupied tables can never exceed $L$.

If we choose to identify tables with labels that take values in $\{1, 2, \dots, L\}$, the hierarchical model first draws a categorical label distribution $\mathbf{p} = (p_1, \dots, p_L)$ from the symmetric Dirichlet distribution with concentration parameter $-\alpha > 0$, and then draws the label of each customer independently from $\operatorname{Categorical}(\mathbf{p})$.

Since the Dirichlet distribution is conjugate to the categorical, the hidden variable $\mathbf{p}$ can be marginalized out to obtain the posterior predictive distribution for the next label state,
$$\Pr(\ell_{n+1} = \ell \mid \ell_1, \dots, \ell_n) = \frac{n_\ell - \alpha}{n - L\alpha},$$
where $n_\ell$ is the number of the first $n$ customers whose label equals $\ell$.

The total probability of starting a new table, summed over the $L - |B|$ unoccupied tables, also agrees with the general formula and is given by
$$\frac{(L-|B|)(-\alpha)}{n - L\alpha} = \frac{\theta + |B|\alpha}{n + \theta}.$$
The marginal probability for the labels is given by
$$\Pr(\ell_1, \dots, \ell_n) = \frac{\Gamma(-L\alpha)}{\Gamma(n - L\alpha)} \prod_{\ell\,:\,n_\ell > 0} \frac{\Gamma(n_\ell - \alpha)}{\Gamma(-\alpha)},$$
where $n_\ell$ is the number of customers with label $\ell$.
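
The posterior predictive is straightforward to compute from the label counts. A minimal sketch (names ours), with $\alpha < 0$ so that $-\alpha$ is the Dirichlet concentration:

```python
def label_predictive(counts, alpha):
    """Posterior predictive probability of each of the L labels, given
    counts[l] = number of previous customers with label l: the next
    customer takes label l with probability (counts[l] - alpha)/(n - L*alpha)."""
    n, L = sum(counts), len(counts)
    return [(c - alpha) / (n - L * alpha) for c in counts]

probs = label_predictive([3, 1, 0], alpha=-0.5)
print(probs)  # sums to 1; each unused label keeps mass -alpha/(n - L*alpha)
```

Note that unused labels retain positive predictive mass, which is exactly what allows new tables to open until all $L$ labels are occupied.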

In general, however, there are multiple label states that all correspond to the same partition.

For a partition with $|B|$ blocks, the number of label states that all correspond to this partition is given by the falling factorial, $(L)_{|B|} = L(L-1)\cdots(L-|B|+1)$; multiplying the label probability by this factor gives the partition probability.

(Practical implementations that evaluate the log probability for partitions via these formulas should work with the log-gamma function, since direct evaluation of the Gamma function overflows for all but small $n$.)

This shows that the Dirichlet-categorical model can be made arbitrarily close to the one-parameter Chinese restaurant process $\operatorname{CRP}(\theta)$ by making $L$ large while holding $\theta = -L\alpha$ fixed.
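
This convergence can be checked numerically: holding $\theta = -L\alpha$ fixed and letting $L$ grow, the label-predictive probabilities approach the $\operatorname{CRP}(\theta)$ seating probabilities $|b|/(n+\theta)$ and $\theta/(n+\theta)$. A sketch (names ours):

```python
def dc_predictive(block_sizes, L, theta):
    """Dirichlet-categorical predictive with L labels and discount
    alpha = -theta/L: returns (per-occupied-table probabilities,
    total probability of an as-yet-unused label)."""
    alpha = -theta / L
    n, k = sum(block_sizes), len(block_sizes)
    denom = n - L * alpha  # equals n + theta
    occupied = [(c - alpha) / denom for c in block_sizes]
    new = (L - k) * (-alpha) / denom
    return occupied, new

for L in (10, 100, 10000):
    occ, new = dc_predictive([3, 1], L, theta=2.0)
    print(L, occ, new)  # occ -> [3/6, 1/6] and new -> 2/6 as L grows
```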

For $0 \le \alpha < 1$ and $\theta > -\alpha$, the table labels are drawn independently from an infinite categorical distribution whose probabilities are constructed by stick-breaking: the $\ell$-th break is sampled from the beta distribution
$$V_\ell \sim \operatorname{Beta}(1-\alpha,\; \theta + \ell\alpha).$$
The categorical probabilities are
$$p_\ell = V_\ell \prod_{\ell'=1}^{\ell-1} (1 - V_{\ell'}).$$
For the parameter settings $\alpha = 0$ and $\theta > 0$, this reduces to the stick-breaking construction of the Dirichlet process (the GEM distribution).

The model can also be modified so that each customer is associated with a combination of classes rather than a single one. This strains the restaurant-tables analogy, and so the process is instead likened to one in which a series of diners sample from some subset of an infinite selection of dishes on offer at a buffet.

This has been named the Indian buffet process and can be used to infer latent features in data.[11]

The Chinese restaurant process is closely connected to Dirichlet processes and Pólya's urn scheme, and is therefore useful in applications of Bayesian statistics, including nonparametric Bayesian methods.

Animation of a Chinese restaurant process with scaling parameter $\theta$. Tables are hidden once the customers of a table can no longer be displayed; however, every table has infinitely many seats. (Recording of an interactive animation.[8])