# CYK Parsing

This is a Method page for CYK Parsing.

The Cocke–Younger–Kasami (CYK) algorithm is an algorithm for parsing PCFG. It is bottom up and makes use of dynamic programming.

## Description

The input to the algorithm is a grammar ${\displaystyle G}$ in Chomsky Normal Form, and a sentence ${\displaystyle X}$. The idea behind the parsing algorithm is to recursively build parses from bottom up, and mantaining a chart C[i, j] which contains productions that generates X[i] ... X[j].

## Pseudocode

```for each i = 1 to n
for each production A -> a
if a == X[i]
add (A -> a, 0) to C[i,i]

for each L = 2 to n
for each i = 1 to n-L+1
for each j = 1 to L - 1
for each production A -> A1 A2
if A1 in C[i,i+j] and A2 in C[i+j+1,i+L]
add (A -> A1 A2, i+j) to C[i,i+L]
```

We can recover the parse trees by traversing through the chart, which contains the production rule used and the point where the subtrees are split. The probability of a parse tree is simply the product of the probability of each production rule that we have used.

## References

CKY parsing was briefly covered in Fall 2011 11-763 Class meeting on Natural Language Parsing.

Daniel H. Younger (1967). Recognition and parsing of context-free languages in time n3. Information and Control 10(2): 189–208. One of the original authors of CYK algorithm

Inside outside algorithm uses CKY parsing to calculate expected counts of production rules being used.