PCFGs are a generative model used in NLP for parsing modeling and syntax analysis. Using this model, data is generated in a branching process where each non-terminal state produces a sequence of states that are visited in a recursive manner. PCFGs are used in a variety of NLP tasks including: grammar checking, lexical learning and machine translation.
Short definition of CFGs
- Let be an alphabet of observable symbols, called terminals.
- Let be a set of variable symbols, called non-terminals, including a special start symbols, .
- The grammar also defines a set of derivation rules, , for transforming terminal symbols into sequences of terminals and non-terminals. The rules are of the form , where and .
Description of PCFGs
PCFGs extend CFGs by defining a multinomial distribution over the set of derivation rules over each terminal symbol.
- Let be the set of derivation rules for the non-terminal (where is on the left-hand side of the rule).
- Let be the multinomial distribution defined over such that
The Generative Process
The process begins with the special start symbol . At every consecutive step, if there are any non-terminal symbols left in the generated sequence, a derivation rule is sampled from the distribution of rules for that non-terminal and it is replaced by the the right-hand of the rule which contains some combination of terminal and non-terminal symbols.
The process ends when there are no more non-terminal symbols in the generated sequence.
This generates a sequence of terminals from . The probability of generating a sequence given a certain derivation tree is simply the product of the probabilities used in that tree.
Relation to HMMs
HMMs are a special type of PCFGs. We can rewrite an HMM as a PCFG as follows:
- Let , the set of non-terminals, be the set of states of the HMM, including start () and end () states.
- Let be the transition probability from state to state , and let be the emission probability of symbol in state .
We can now define the following derivation rules:
- , where , </math>h, h' \in H</math>
- , for </math>h \in H</math>