Inside–outside algorithm

In computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar.

It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars.

It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

The inside probability {\displaystyle \beta _{j}(p,q)} is the total probability of generating words {\displaystyle w_{p}\cdots w_{q}}, given the root nonterminal {\displaystyle N^{j}}:[1]

{\displaystyle \beta _{j}(p,q)=P(w_{pq}|N_{pq}^{j},G)}

The outside probability {\displaystyle \alpha _{j}(p,q)} is the total probability of beginning with the start symbol {\displaystyle N^{1}} and generating the nonterminal {\displaystyle N_{pq}^{j}} and all the words outside {\displaystyle w_{p}\cdots w_{q}}:[1]

{\displaystyle \alpha _{j}(p,q)=P(w_{1(p-1)},N_{pq}^{j},w_{(q+1)n}|G)}

Base Case:

{\displaystyle \beta _{j}(p,p)=P(N_{j}\rightarrow w_{p})}

General case: Suppose there is a rule {\displaystyle N_{j}\rightarrow N_{r}N_{s}} in the grammar. Then the probability of generating {\displaystyle w_{p}\cdots w_{q}} starting with a subtree rooted at {\displaystyle N_{j}} is

{\displaystyle \sum _{k=p}^{q-1}P(N_{j}\rightarrow N_{r}N_{s})\beta _{r}(p,k)\beta _{s}(k+1,q)}

The inside probability {\displaystyle \beta _{j}(p,q)} is just the sum over all such possible rules:

{\displaystyle \beta _{j}(p,q)=\sum _{N_{r},N_{s}}\sum _{k=p}^{q-1}P(N_{j}\rightarrow N_{r}N_{s})\beta _{r}(p,k)\beta _{s}(k+1,q)}
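The inside recursion can be run bottom-up over span lengths, exactly as in CYK parsing. The following is a minimal sketch; the toy grammar (a single nonterminal S with rules S → S S and S → 'a') and its probabilities are illustrative assumptions, not part of the article:

```python
from collections import defaultdict

def inside_probabilities(words, binary_rules, lexical_rules):
    """Compute beta[(j, p, q)] = P(w_p ... w_q | N_j) bottom-up.

    binary_rules:  dict mapping (j, r, s) -> P(N_j -> N_r N_s)
    lexical_rules: dict mapping (j, word) -> P(N_j -> word)
    Spans are 1-indexed and inclusive, matching the text above.
    """
    n = len(words)
    beta = defaultdict(float)
    # Base case: beta_j(p, p) = P(N_j -> w_p)
    for p in range(1, n + 1):
        for (j, word), prob in lexical_rules.items():
            if word == words[p - 1]:
                beta[(j, p, p)] += prob
    # General case: sum over rules N_j -> N_r N_s and split points k
    for span in range(2, n + 1):
        for p in range(1, n - span + 2):
            q = p + span - 1
            for (j, r, s), prob in binary_rules.items():
                for k in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, k)] * beta[(s, k + 1, q)]
    return beta

# Toy grammar (illustrative assumption): S -> S S (0.3), S -> 'a' (0.7)
binary = {('S', 'S', 'S'): 0.3}
lexical = {('S', 'a'): 0.7}
beta = inside_probabilities(['a', 'a', 'a'], binary, lexical)
print(beta[('S', 1, 3)])  # ≈ 0.06174: two binary trees, each 0.3**2 * 0.7**3
```

The whole-sentence value sums over both binarizations of three leaves, which is exactly what the sum over split points k accomplishes.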

Base Case:

{\displaystyle \alpha _{j}(1,n)={\begin{cases}1&{\mbox{if }}j=1\\0&{\mbox{otherwise}}\end{cases}}}

Here the start symbol is {\displaystyle N_{1}}.

General case: Suppose there is a rule {\displaystyle N_{r}\rightarrow N_{j}N_{s}} in the grammar that generates {\displaystyle N_{j}}. Then the left contribution of that rule to the outside probability {\displaystyle \alpha _{j}(p,q)} is

{\displaystyle \sum _{k=q+1}^{n}P(N_{r}\rightarrow N_{j}N_{s})\alpha _{r}(p,k)\beta _{s}(q+1,k)}

Now suppose there is a rule {\displaystyle N_{r}\rightarrow N_{s}N_{j}} in the grammar. Then the right contribution of that rule to the outside probability {\displaystyle \alpha _{j}(p,q)} is

{\displaystyle \sum _{k=1}^{p-1}P(N_{r}\rightarrow N_{s}N_{j})\alpha _{r}(k,q)\beta _{s}(k,p-1)}

The outside probability {\displaystyle \alpha _{j}(p,q)} is the sum of the left and right contributions over all such rules:

{\displaystyle \alpha _{j}(p,q)=\sum _{N_{r},N_{s}}\left[\sum _{k=q+1}^{n}P(N_{r}\rightarrow N_{j}N_{s})\alpha _{r}(p,k)\beta _{s}(q+1,k)+\sum _{k=1}^{p-1}P(N_{r}\rightarrow N_{s}N_{j})\alpha _{r}(k,q)\beta _{s}(k,p-1)\right]}
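The outside recursion runs top-down, from the whole-sentence span to single words, consuming an already-computed inside table. Below is a minimal sketch under the same illustrative toy grammar as before (S → S S with probability 0.3, S → 'a' with probability 0.7, sentence "a a a"); the hard-coded inside table was computed with the inside recursion and is an assumption of this example:

```python
from collections import defaultdict

def outside_probabilities(n, beta, binary_rules, start='S'):
    """Compute alpha[(j, p, q)] top-down from an inside table beta,
    using the base case and the left/right-contribution recursion above.

    binary_rules: dict mapping (r, left, right) -> P(N_r -> N_left N_right)
    """
    alpha = defaultdict(float)
    alpha[(start, 1, n)] = 1.0  # Base case: alpha_j(1, n) = 1 iff j is the start symbol
    for span in range(n - 1, 0, -1):  # larger spans before smaller ones
        for p in range(1, n - span + 2):
            q = p + span - 1
            for (r, left, right), prob in binary_rules.items():
                # Left contribution: the left child spans (p, q), sibling spans (q+1, k)
                for k in range(q + 1, n + 1):
                    alpha[(left, p, q)] += prob * alpha[(r, p, k)] * beta[(right, q + 1, k)]
                # Right contribution: the right child spans (p, q), sibling spans (k, p-1)
                for k in range(1, p):
                    alpha[(right, p, q)] += prob * alpha[(r, k, q)] * beta[(left, k, p - 1)]
    return alpha

# Inside table for the toy grammar S -> S S (0.3), S -> 'a' (0.7) on "a a a"
beta = defaultdict(float, {
    ('S', 1, 1): 0.7, ('S', 2, 2): 0.7, ('S', 3, 3): 0.7,
    ('S', 1, 2): 0.147, ('S', 2, 3): 0.147,
    ('S', 1, 3): 0.06174,
})
alpha = outside_probabilities(3, beta, {('S', 'S', 'S'): 0.3})
# Sanity check: for each position p, sum_j alpha_j(p, p) * beta_j(p, p)
# equals the total sentence probability beta_1(1, n).
print(alpha[('S', 1, 1)] * beta[('S', 1, 1)])  # ≈ 0.06174
```

The identity checked at the end is what makes the pair of tables useful for EM: products of outside and inside probabilities give the expected contribution of each span, and hence of each rule, to the sentence.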