Difference between revisions of "Inside Outside algorithm"
m |
m |
||
Line 48: | Line 48: | ||
[http://www.cs.jhu.edu/~jason/465/iobasics.pdf Notes on Inside-Outside algorithm. Jason Eisner.] provides a good walkthrough and explanation on using the inside-outside algorithm with a PCFG | [http://www.cs.jhu.edu/~jason/465/iobasics.pdf Notes on Inside-Outside algorithm. Jason Eisner.] provides a good walkthrough and explanation on using the inside-outside algorithm with a PCFG | ||
+ | |||
+ | [http://domino.watson.ibm.com/library/cyberdig.nsf/papers/C13E80CEF63EB770852576B6004852B4/$File/rc21636.pdf A Derivation of the Inside-Outside Algorithm from the EM Algorithm. John D Lafferty (2000).] shows the the inside-outside algorithm as a special case of EM. |
Revision as of 12:18, 29 November 2011
This is a Method page for the Inside-outside algorithm.
Contents
Background
The inside-outside algorithm is a way of estimating probabilities in a PCFG. It is first introduced [| Baker, 1979]. The inside outside algorithm is in fact a generalization of the forward-backward algorithm (for hidden Markov models) to PCFGs.
Algorithm
The algorithm is a dynamic programming algorithm that is often used with chart parsers to estimate expected production counts. Here, we assume the grammar is of Chomsky Normal Form.
The algorithm works by computing 2 probabilities for each nonterminal and span .
Inside probabilities
The inside probability is defined as , which is the probability of a nonterminal generating the word sequence to .
The inside probability can be calculated recursively with the following recurrence relation:
Intuitively, this can be seen as computing the sum over all possible ways of building trees rooted by and generating the word span .
For the base case, it is simply .
Outside counts
The outside probability is defined as , which is the probability of generating a parse tree spanning the entire sentence that uses nonterminal to span .
The reccurrence relation is thus:
The first term is basically considering all ways of generating trees where is used as a right subtree, and vis a vis for the second term.
Dynamic programming: Putting them together
In a standard EM framework, we would want to compute for each production rule, the expected number of times it is used for a given sentence, which we can compute by summing over the counts of using the production for all possible spans (and separation points)
Complexity
For a given sentence of length and grammar , the inside outside algorithm is
References
Trainable grammars for speech recognition. J Baker (1979). The original paper introducing the inside-outside algorithm
Notes on Inside-Outside algorithm. Jason Eisner. provides a good walkthrough and explanation on using the inside-outside algorithm with a PCFG
A Derivation of the Inside-Outside Algorithm from the EM Algorithm. John D Lafferty (2000). shows the the inside-outside algorithm as a special case of EM.