Difference between revisions of "Inside Outside algorithm"

Revision as of 13:04, 29 November 2011

This is a Method page for the Inside-outside algorithm.

Background

The inside-outside algorithm is a method for estimating the rule probabilities of a probabilistic context-free grammar (PCFG). It was first introduced by Baker (1979). It generalizes the forward-backward algorithm for hidden Markov models to PCFGs.

It is often used in the E-step of the EM algorithm, where it computes the expected number of times each grammar rule is used.

Algorithm

The algorithm is a dynamic programming algorithm that is often used with chart parsers to estimate expected production counts. Here, we assume the grammar $G$ is in Chomsky normal form.

The algorithm works by computing two probabilities, an inside probability and an outside probability, for each nonterminal $A$ and span $i,j$ .

Inside probabilities

The inside probability is defined as $\alpha (A,i,j)=P(A{\overset {*}{\Rightarrow }}w_{i}...w_{j}|G)$ , which is the probability that nonterminal $A$ generates the word sequence $w_{i}$ to $w_{j}$ .

The inside probability can be calculated recursively with the following recurrence relation:

$\alpha (A,i,j)=\sum _{B,C}\sum _{i\leq k<j}p(A\rightarrow BC)\alpha (B,i,k)\alpha (C,k+1,j)$

Intuitively, this sums over all possible ways of building a tree rooted at $A$ that generates the word span $i,j$ : every rule $A\rightarrow BC$ combined with every split point $k$ that divides the span between $B$ and $C$ .

For the base case, it is simply $\alpha (A,i,i)=p(A\rightarrow w_{i})$ .
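The recurrence and base case above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `inside_probabilities`, the dictionary encoding of the rules, and the toy grammar are all assumptions made for the example. Spans are 1-based and inclusive, to match the notation in this section.

```python
from collections import defaultdict


def inside_probabilities(words, binary_rules, lexical_rules):
    """Return alpha[(A, i, j)] for 1-based, inclusive spans i..j.

    binary_rules maps (A, B, C) to p(A -> B C);
    lexical_rules maps (A, word) to p(A -> word).
    (This encoding is an assumption of the sketch.)
    """
    n = len(words)
    alpha = defaultdict(float)

    # Base case: alpha(A, i, i) = p(A -> w_i)
    for i, word in enumerate(words, start=1):
        for (A, w), p in lexical_rules.items():
            if w == word:
                alpha[(A, i, i)] += p

    # Recursive case: fill shorter spans before longer ones
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for (A, B, C), p in binary_rules.items():
                # B covers i..k and C covers k+1..j, so k stops before j
                for k in range(i, j):
                    alpha[(A, i, j)] += p * alpha[(B, i, k)] * alpha[(C, k + 1, j)]
    return alpha


# Hypothetical toy grammar: S -> S S (0.4), S -> 'a' (0.6)
binary_rules = {("S", "S", "S"): 0.4}
lexical_rules = {("S", "a"): 0.6}
alpha = inside_probabilities(["a", "a", "a"], binary_rules, lexical_rules)

# "a a a" has two parses, each with probability 0.4**2 * 0.6**3
print(alpha[("S", 1, 3)])
```

Filling the chart in order of increasing span length guarantees that both factors on the right-hand side are already computed when a longer span needs them.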

Outside probabilities

The outside probability is defined as $\beta (A,i,j)=P(S{\overset {*}{\Rightarrow }}w_{1},...,w_{i-1},A,w_{j+1},...,w_{n})$ , which is the probability that the start symbol $S$ generates the words outside the span $i,j$ together with the nonterminal $A$ covering that span.

The base case is $\beta (S,1,n)=1$ , with $\beta (A,1,n)=0$ for every $A\neq S$ . For all other spans, the recurrence relation is:

$\beta (A,i,j)=\sum _{B,C}\sum _{1\leq k<i}p(B\rightarrow CA)\alpha (C,k,i-1)\beta (B,k,j)+\sum _{B,C}\sum _{j<k\leq n}p(B\rightarrow AC)\alpha (C,j+1,k)\beta (B,i,k)$

The first term sums over all ways of generating trees in which $A$ is the right child of its parent $B$ , and the second term sums over those in which $A$ is the left child.
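As a small worked example, consider a hypothetical toy grammar (not part of the original page) with a single nonterminal $S$ , the rules $S\rightarrow SS$ with probability 0.4 and $S\rightarrow a$ with probability 0.6, and the sentence $a\,a\,a$ , so that $n=3$ . The inside probabilities are $\alpha (S,i,i)=0.6$ and $\alpha (S,1,2)=\alpha (S,2,3)=0.4\cdot 0.6\cdot 0.6=0.144$ . Starting from $\beta (S,1,3)=1$ , since the start symbol covers the whole sentence, and working from longer spans to shorter ones:

$\beta (S,1,2)=p(S\rightarrow SS)\alpha (S,3,3)\beta (S,1,3)=0.4\cdot 0.6\cdot 1=0.24$

$\beta (S,2,3)=p(S\rightarrow SS)\alpha (S,1,1)\beta (S,1,3)=0.4\cdot 0.6\cdot 1=0.24$

$\beta (S,2,2)=p(S\rightarrow SS)\alpha (S,1,1)\beta (S,1,2)+p(S\rightarrow SS)\alpha (S,3,3)\beta (S,2,3)=0.0576+0.0576=0.1152$

For $\beta (S,1,2)$ only the second term contributes, because nothing lies to the left of position 1; for $\beta (S,2,3)$ only the first term contributes. As a sanity check, $\alpha (S,2,2)\beta (S,2,2)=0.6\cdot 0.1152=0.06912$ , which equals the total sentence probability $2\cdot 0.4^{2}\cdot 0.6^{3}$ , as expected since every parse places exactly one $S$ over the second word.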

Dynamic programming: Putting them together

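In a standard EM framework, we want to compute, for each production rule, the expected number of times it is used in a parse of a given sentence. Using the inside and outside probabilities, this is

$count(A\rightarrow BC)={\frac {p(A\rightarrow BC)}{\alpha (S,1,n)}}\sum _{1\leq i\leq j<k\leq n}\beta (A,i,k)\alpha (B,i,j)\alpha (C,j+1,k)$

where $\alpha (S,1,n)$ is the total probability of the sentence, which normalizes the sum into an expectation over the parses of that sentence.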
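The whole computation can be sketched end to end in Python. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function names, the dictionary encoding of the grammar, and the toy grammar are invented for the example. The sketch divides by the sentence probability $\alpha (S,1,n)$ so that the result is an expectation under the posterior distribution over parse trees, and it requires $j<k$ so that $C$ always covers a non-empty span.

```python
from collections import defaultdict


def inside(words, binary, lexical):
    """alpha[(A, i, j)]: probability that A derives words i..j (1-based)."""
    n = len(words)
    alpha = defaultdict(float)
    for i, word in enumerate(words, start=1):
        for (A, w), p in lexical.items():
            if w == word:
                alpha[(A, i, i)] += p  # base case
    for length in range(2, n + 1):  # shorter spans first
        for i in range(1, n - length + 2):
            j = i + length - 1
            for (A, B, C), p in binary.items():
                for k in range(i, j):
                    alpha[(A, i, j)] += p * alpha[(B, i, k)] * alpha[(C, k + 1, j)]
    return alpha


def outside(n, binary, alpha, start="S"):
    """beta[(A, i, j)]: probability of everything outside span i..j, given A there."""
    beta = defaultdict(float)
    beta[(start, 1, n)] = 1.0  # base case: start symbol over the whole sentence
    for length in range(n - 1, 0, -1):  # longer spans first
        for i in range(1, n - length + 2):
            j = i + length - 1
            for (parent, left, right), p in binary.items():
                # Span i..j is the right child; the left sibling covers k..i-1
                for k in range(1, i):
                    beta[(right, i, j)] += p * alpha[(left, k, i - 1)] * beta[(parent, k, j)]
                # Span i..j is the left child; the right sibling covers j+1..k
                for k in range(j + 1, n + 1):
                    beta[(left, i, j)] += p * alpha[(right, j + 1, k)] * beta[(parent, i, k)]
    return beta


def expected_binary_counts(words, binary, lexical, start="S"):
    """Expected number of uses of each binary rule in a parse of `words`."""
    n = len(words)
    alpha = inside(words, binary, lexical)
    beta = outside(n, binary, alpha, start)
    z = alpha[(start, 1, n)]  # total probability of the sentence
    if z == 0.0:
        raise ValueError("sentence has zero probability under the grammar")
    counts = {}
    for (A, B, C), p in binary.items():
        total = 0.0
        for i in range(1, n + 1):
            for k in range(i + 1, n + 1):
                for j in range(i, k):  # B covers i..j, C covers j+1..k
                    total += beta[(A, i, k)] * alpha[(B, i, j)] * alpha[(C, j + 1, k)]
        counts[(A, B, C)] = p * total / z
    return counts


# Hypothetical toy grammar: S -> S S (0.4), S -> 'a' (0.6)
binary = {("S", "S", "S"): 0.4}
lexical = {("S", "a"): 0.6}
counts = expected_binary_counts(["a", "a", "a"], binary, lexical)
print(counts[("S", "S", "S")])
```

For this grammar, each of the two parses of the three-word sentence uses $S\rightarrow SS$ exactly twice, so the expected count should come out at 2 up to floating-point rounding. In an EM loop, such counts would be accumulated over all training sentences and then renormalized per left-hand side in the M-step.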

