# PCFGs

PCFGs are a generative model used in NLP for parsing modeling and syntax analysis. Using this model, data is generated in a branching process where each non-terminal state produces a sequence of states that are visited in a recursive manner. PCFGs are used in a variety of NLP tasks including: grammar checking, lexical learning and machine translation.

## Contents

## Short definition of CFGs

- Let be an alphabet of observable symbols, called
*terminals*. - Let be a set of variable symbols, called
*non-terminals*, including a special start symbols, . - The grammar also defines a set of derivation rules, , for transforming terminal symbols into sequences of terminals and non-terminals. The rules are of the form , where and .

## Description of PCFGs

PCFGs extend CFGs by defining a multinomial distribution over the set of derivation rules over each terminal symbol.

- Let be the set of derivation rules for the non-terminal (where is on the left-hand side of the rule).
- Let be the multinomial distribution defined over such that

## The Generative Process

The process begins with the special start symbol . At every consecutive step, if there are any non-terminal symbols left in the generated sequence, a derivation rule is sampled from the distribution of rules for that non-terminal and it is replaced by the the right-hand of the rule which contains some combination of terminal and non-terminal symbols.

The process ends when there are no more non-terminal symbols in the generated sequence.

This generates a sequence of terminals from . The probability of generating a sequence given a certain derivation tree is simply the product of the probabilities used in that tree.

## Decoding

Decoding PCFGs is done with variants of the Earley and CYK algorithms.

## Relation to HMMs

HMMs are a special type of PCFGs. We can rewrite an HMM as a PCFG as follows:

- Let , the set of non-terminals, be the set of states of the HMM, including start () and end () states.
- Let be the transition probability from state to state , and let be the emission probability of symbol in state .

We can now define the following derivation rules:

- , where , </math>h, h' \in H</math>
- , for </math>h \in H</math>