Difference between revisions of "Inside Outside algorithm"

From Cohen Courses
Jump to navigationJump to search
m
m
Line 4: Line 4:
  
 
The inside-outside algorithm is a way of estimating probabilities in a PCFG. It is first introduced [[http://asadl.org/jasa/resource/1/jasman/v65/iS1/pS132_s1 | Baker, 1979]].  The inside outside algorithm is in fact a generalization of the forward-backward algorithm (for hidden Markov models) to PCFGs.
 
The inside-outside algorithm is a way of estimating probabilities in a PCFG. It is first introduced [[http://asadl.org/jasa/resource/1/jasman/v65/iS1/pS132_s1 | Baker, 1979]].  The inside outside algorithm is in fact a generalization of the forward-backward algorithm (for hidden Markov models) to PCFGs.
 
It is often used as part of the EM algorithm for computing expectations.
 
  
 
== Algorithm ==
 
== Algorithm ==
Line 40: Line 38:
  
 
<math>count(A\rightarrow BC)=\dfrac{p(A\rightarrow BC)}{\alpha(S,1,n)}\sum_{1\leq i \leq j \leq k \leq n} \beta(A,i,k)\alpha(B,i,j)\alpha(C,j+1,k)</math>
 
<math>count(A\rightarrow BC)=\dfrac{p(A\rightarrow BC)}{\alpha(S,1,n)}\sum_{1\leq i \leq j \leq k \leq n} \beta(A,i,k)\alpha(B,i,j)\alpha(C,j+1,k)</math>
 +
 +
== Time complexity ==
 +
 +
For a given sentence of length <math>n</math> and grammar <math>g</math>, the inside outside algorithm is <math>O(Gn^3)</math>

Revision as of 12:12, 29 November 2011

This is a Method page for the Inside-outside algorithm.

Background

The inside-outside algorithm is a way of estimating probabilities in a PCFG. It is first introduced [| Baker, 1979]. The inside outside algorithm is in fact a generalization of the forward-backward algorithm (for hidden Markov models) to PCFGs.

Algorithm

The algorithm is a dynamic programming algorithm that is often used with chart parsers to estimate expected production counts. Here, we assume the grammar is of Chomsky Normal Form.

The algorithm works by computing 2 probabilities for each nonterminal and span .

Inside probabilities

The inside probability is defined as , which is the probability of a nonterminal generating the word sequence to .

The inside probability can be calculated recursively with the following recurrence relation:

Intuitively, this can be seen as computing the sum over all possible ways of building trees rooted by and generating the word span .

For the base case, it is simply .

Outside counts

The outside probability is defined as , which is the probability of generating a parse tree spanning the entire sentence that uses nonterminal to span .

The reccurrence relation is thus:

The first term is basically considering all ways of generating trees where is used as a right subtree, and vis a vis for the second term.

Dynamic programming: Putting them together

In a standard EM framework, we would want to compute for each production rule, the expected number of times it is used for a given sentence, which we can compute by summing over the counts of using the production for all possible spans (and separation points)

Time complexity

For a given sentence of length and grammar , the inside outside algorithm is