Difference between revisions of "CYK Parsing"

From Cohen Courses
Jump to navigationJump to search
m (Created page with 'This is a method page for CYK Parsing.')
 
 
(21 intermediate revisions by 2 users not shown)
Line 1: Line 1:
This is a method page for CYK Parsing.
+
This is a [[Category::Method]] page for CYK Parsing.
 +
 
 +
The Cocke–Younger–Kasami (CYK) algorithm is an algorithm for [[AddressesProblem::parsing]] PCFG. It is bottom up and makes use of dynamic programming.
 +
 
 +
==Description==
 +
 
 +
The input to the algorithm is a grammar <math>G</math> in Chomsky Normal Form, and a sentence <math>X</math>.
 +
The idea behind the parsing algorithm is to recursively build parses from bottom up, and mantaining a chart C[i, j] which contains productions that generates X[i] ... X[j].
 +
 
 +
== Pseudocode ==
 +
 
 +
<pre>
 +
for each i = 1 to n
 +
  for each production A -> a
 +
    if a == X[i]
 +
      add (A -> a, 0) to C[i,i]
 +
 
 +
for each L = 2 to n
 +
  for each i = 1 to n-L+1
 +
    for each j = 1 to L - 1
 +
      for each production A -> A1 A2
 +
        if A1 in C[i,i+j] and A2 in C[i+j+1,i+L]
 +
          add (A -> A1 A2, i+j) to C[i,i+L]
 +
</pre>
 +
 
 +
We can recover the parse trees by traversing through the chart, which contains the production rule used and the point where the subtrees are split. The probability of a parse tree is simply the product of the probability of each production rule that we have used.
 +
 
 +
== References ==
 +
 
 +
CKY parsing was briefly covered in [[Class_Meeting_for_10-710_10-11-2011 | Fall 2011 11-763 Class meeting]] on Natural Language Parsing.
 +
 
 +
[http://www.sciencedirect.com/science/article/pii/S001999586780007X Daniel H. Younger (1967). Recognition and parsing of context-free languages in time n3. Information and Control 10(2): 189–208.] One of the original authors of CYK algorithm
 +
 
 +
[http://en.wikipedia.org/wiki/Inside-outside_algorithm Inside outside algorithm] uses CKY parsing to calculate expected counts of production rules being used.
 +
 
 +
[http://www.sciencedirect.com/science/article/pii/088523089090022X The estimation of stochastic context-free grammars using the Inside-Outside algorithm]
 +
 
 +
== Relevant papers ==
 +
{{#ask: [[UsesMethod::CYK Parsing]]
 +
| ?AddressesProblem
 +
| ?UsesDataset
 +
}}

Latest revision as of 02:33, 2 November 2011

This is a Method page for CYK Parsing.

The Cocke–Younger–Kasami (CYK) algorithm is an algorithm for parsing PCFG. It is bottom up and makes use of dynamic programming.

Description

The input to the algorithm is a grammar in Chomsky Normal Form, and a sentence . The idea behind the parsing algorithm is to recursively build parses from bottom up, and mantaining a chart C[i, j] which contains productions that generates X[i] ... X[j].

Pseudocode

for each i = 1 to n
  for each production A -> a
    if a == X[i]
      add (A -> a, 0) to C[i,i]

for each L = 2 to n
  for each i = 1 to n-L+1
    for each j = 1 to L - 1
      for each production A -> A1 A2
        if A1 in C[i,i+j] and A2 in C[i+j+1,i+L]
          add (A -> A1 A2, i+j) to C[i,i+L]

We can recover the parse trees by traversing through the chart, which contains the production rule used and the point where the subtrees are split. The probability of a parse tree is simply the product of the probability of each production rule that we have used.

References

CKY parsing was briefly covered in Fall 2011 11-763 Class meeting on Natural Language Parsing.

Daniel H. Younger (1967). Recognition and parsing of context-free languages in time n3. Information and Control 10(2): 189–208. One of the original authors of CYK algorithm

Inside outside algorithm uses CKY parsing to calculate expected counts of production rules being used.

The estimation of stochastic context-free grammars using the Inside-Outside algorithm

Relevant papers