Hierarchical hidden Markov model

Even though a fully connected HMM could always be used if enough training data is available, it is often useful to constrain the model by not allowing arbitrary state transitions.
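As a concrete illustration of such a constraint, the following sketch (assuming NumPy and a left-to-right topology, neither of which is prescribed here) masks a transition matrix so that each state can only stay where it is or advance to the next state.

    import numpy as np

    n_states = 4

    # Allowed transitions: state i may go only to i (stay) or i + 1 (advance),
    # i.e. a left-to-right constraint on the otherwise fully connected HMM.
    mask = np.triu(np.ones((n_states, n_states)), k=0) * np.tril(np.ones((n_states, n_states)), k=1)

    # Random transition probabilities, zeroed outside the allowed pattern and renormalized.
    A = np.random.rand(n_states, n_states) * mask
    A /= A.sum(axis=1, keepdims=True)

    print(A)  # each row sums to 1; forbidden transitions keep probability 0

Transitions initialized to zero also stay at zero under Baum–Welch re-estimation, which is what makes such structural constraints easy to enforce in practice.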

Only the production states emit observation symbols in the usual HMM sense.
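A minimal sketch of the three state kinds, using illustrative Python names (ProductionState, InternalState, EndState) that are assumptions of this example rather than terms fixed by the model: internal states activate a child submodel, production states carry the emission probabilities, and an end (terminal) state returns control to the parent.

    from dataclasses import dataclass, field

    @dataclass
    class ProductionState:
        emission: dict            # P(symbol | state), as in an ordinary HMM

    @dataclass
    class EndState:
        pass                      # emits nothing; returns control to the parent

    @dataclass
    class InternalState:
        children: list            # states of the submodel this state activates
        vertical: list = field(default_factory=list)    # P(entering each child)
        horizontal: list = field(default_factory=list)  # transitions among the children

    # Only ProductionState instances carry emission probabilities.
    leaf_a = ProductionState(emission={"x": 0.7, "y": 0.3})
    leaf_b = ProductionState(emission={"x": 0.2, "y": 0.8})
    root = InternalState(
        children=[leaf_a, leaf_b, EndState()],
        vertical=[0.5, 0.5, 0.0],
        horizontal=[[0.4, 0.4, 0.2], [0.3, 0.4, 0.3], [0.0, 0.0, 1.0]],
    )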

Classical HHMMs require a pre-defined topology, meaning that the number and hierarchical structure of the submodels must be known in advance.
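One practical consequence of a pre-defined topology is that the shape of every parameter array is fixed before any training takes place. The sketch below assumes a hypothetical nested-dict format for the topology and simply allocates uniform vertical and horizontal transition parameters of the corresponding sizes.

    import numpy as np

    # Hypothetical topology spec: a root submodel with two internal children,
    # which in turn contain 3 and 2 production substates respectively.
    topology = {"substates": [{"substates": 3}, {"substates": 2}]}

    def allocate_parameters(node):
        """Allocate uniform parameters for a fixed, pre-defined topology."""
        children = node["substates"]
        if isinstance(children, int):
            n = children + 1                      # production substates + end state
            child_params = []
        else:
            n = len(children) + 1                 # internal substates + end state
            child_params = [allocate_parameters(c) for c in children]
        return {
            "vertical": np.full(n, 1.0 / n),          # P(entering each child from above)
            "horizontal": np.full((n, n), 1.0 / n),   # transitions among the children
            "children": child_params,
        }

    params = allocate_parameters(topology)
    print(params["horizontal"].shape)  # (3, 3): two internal children plus an end state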

Samko et al. (2010)[1] used information about states from feature space (i.e., from outside the Markov model itself) to define the topology of a new HHMM in an unsupervised way.
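The sketch below is only a loose illustration of that idea and not the algorithm of Samko et al.: it clusters synthetic observation features with scikit-learn's KMeans and uses the silhouette score as a stand-in model-selection criterion to choose how many substates a new submodel should contain.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(0)
    # Synthetic feature vectors drawn from three well-separated groups.
    features = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in (0.0, 2.0, 4.0)])

    scores = {}
    for k in range(2, 7):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
        scores[k] = silhouette_score(features, labels)

    n_substates = max(scores, key=scores.get)
    print("chosen number of substates:", n_substates)  # 3 for this synthetic data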

However, such external data containing relevant information for HHMM construction may not be available in all contexts, e.g. in language processing.[2]

Illustration of the structure of an HHMM. Gray lines show vertical transitions; black lines show horizontal transitions. The light gray circles are internal states and the dark gray circles are the terminal states that return control to the activating state. The production states are not shown in this figure.