Pachinko allocation

Topic models are a suite of algorithms to uncover the hidden thematic structure of a collection of documents.

[2] While first described and implemented in the context of natural language processing, the algorithm may have applications in other fields such as bioinformatics.

The model is named for pachinko machines—a game popular in Japan, in which metal balls bounce down around a complex collection of pins until they land in various bins at the bottom.

[4] In 2007, McCallum and his colleagues proposed a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP).

[2] The algorithm has been implemented in the MALLET software package published by McCallum's group at the University of Massachusetts Amherst.