Difference between revisions of "PCFGs"
(11 intermediate revisions by one other user not shown) | |||
Line 1: | Line 1: | ||
− | |||
− | |||
− | |||
PCFGs are a generative model used in NLP for parsing modeling and syntax analysis. Using this model, data is generated in a branching process where each non-terminal state produces a sequence of states that are visited in a recursive manner. | PCFGs are a generative model used in NLP for parsing modeling and syntax analysis. Using this model, data is generated in a branching process where each non-terminal state produces a sequence of states that are visited in a recursive manner. | ||
− | PCFGs are used in a variety of NLP tasks including: grammar checking, lexical learning and machine translation. | + | PCFGs are used in a variety of NLP tasks including: grammar checking, lexical learning and [[Machine_Translation | machine translation]]. |
== Short definition of CFGs == | == Short definition of CFGs == | ||
− | * Let <math>\Sigma</math> be an alphabet of observable symbols, called | + | * Let <math>\Sigma</math> be an alphabet of observable symbols, called ''terminals''. |
− | * Let <math>N</math> be a set of variable symbols, called | + | * Let <math>N</math> be a set of variable symbols, called ''non-terminals'', including a special start symbols, <math> S \in N</math>. |
* The grammar also defines a set of derivation rules, <math> R</math>, for transforming terminal symbols into sequences of terminals and non-terminals. The rules are of the form <math> X \rightarrow \alpha </math>, where <math> X \in N</math> and <math> \alpha \in \{N \cup \Sigma\}^*</math>. | * The grammar also defines a set of derivation rules, <math> R</math>, for transforming terminal symbols into sequences of terminals and non-terminals. The rules are of the form <math> X \rightarrow \alpha </math>, where <math> X \in N</math> and <math> \alpha \in \{N \cup \Sigma\}^*</math>. | ||
Line 19: | Line 16: | ||
== The Generative Process == | == The Generative Process == | ||
− | The process begins with the special start symbol <math>S \in N</math>. At every consecutive step, if there are any non-terminal symbols left in the generated sequence, a derivation rule is | + | The process begins with the special start symbol <math>S \in N</math>. At every consecutive step, if there are any non-terminal symbols left in the generated sequence, a derivation rule is sampled from the distribution of rules for that non-terminal and it is replaced by the the right-hand of the rule which contains some combination of terminal and non-terminal symbols. |
The process ends when there are no more non-terminal symbols in the generated sequence. | The process ends when there are no more non-terminal symbols in the generated sequence. | ||
Line 27: | Line 24: | ||
== Decoding == | == Decoding == | ||
− | Decoding PCFGs is done with variants of the | + | Decoding PCFGs is done with variants of the [http://en.wikipedia.org/wiki/Earley_algorithm Earley] and [http://en.wikipedia.org/wiki/CYK_algorithm CYK] algorithms. |
== Relation to HMMs == | == Relation to HMMs == | ||
− | + | [[HMM]]s are a special type of PCFGs. We can rewrite an HMM as a PCFG as follows: | |
* Let <math>N</math>, the set of non-terminals, be the set of states <math>H</math> of the HMM, including start (<math>S \in H</math>) and end (<math>E \in H</math>) states. | * Let <math>N</math>, the set of non-terminals, be the set of states <math>H</math> of the HMM, including start (<math>S \in H</math>) and end (<math>E \in H</math>) states. | ||
* Let <math>\pi_{h,h'}</math> be the transition probability from state <math>h</math> to state <math>h'</math>, and let <math>\tau_{h,x}</math> be the emission probability of symbol <math>x</math> in state <math>h</math>. | * Let <math>\pi_{h,h'}</math> be the transition probability from state <math>h</math> to state <math>h'</math>, and let <math>\tau_{h,x}</math> be the emission probability of symbol <math>x</math> in state <math>h</math>. | ||
Line 40: | Line 37: | ||
== Related Work == | == Related Work == | ||
+ | {{#ask: [[UsesMethod::PCFGs]] | ||
+ | | ?AddressesProblem | ||
+ | | ?UsesDataset | ||
+ | }} | ||
[[Category::method]] | [[Category::method]] |
Latest revision as of 15:20, 29 November 2011
PCFGs are a generative model used in NLP for parsing modeling and syntax analysis. Using this model, data is generated in a branching process where each non-terminal state produces a sequence of states that are visited in a recursive manner. PCFGs are used in a variety of NLP tasks including: grammar checking, lexical learning and machine translation.
Contents
Short definition of CFGs
- Let be an alphabet of observable symbols, called terminals.
- Let be a set of variable symbols, called non-terminals, including a special start symbols, .
- The grammar also defines a set of derivation rules, , for transforming terminal symbols into sequences of terminals and non-terminals. The rules are of the form , where and .
Description of PCFGs
PCFGs extend CFGs by defining a multinomial distribution over the set of derivation rules over each terminal symbol.
- Let be the set of derivation rules for the non-terminal (where is on the left-hand side of the rule).
- Let be the multinomial distribution defined over such that
The Generative Process
The process begins with the special start symbol . At every consecutive step, if there are any non-terminal symbols left in the generated sequence, a derivation rule is sampled from the distribution of rules for that non-terminal and it is replaced by the the right-hand of the rule which contains some combination of terminal and non-terminal symbols.
The process ends when there are no more non-terminal symbols in the generated sequence.
This generates a sequence of terminals from . The probability of generating a sequence given a certain derivation tree is simply the product of the probabilities used in that tree.
Decoding
Decoding PCFGs is done with variants of the Earley and CYK algorithms.
Relation to HMMs
HMMs are a special type of PCFGs. We can rewrite an HMM as a PCFG as follows:
- Let , the set of non-terminals, be the set of states of the HMM, including start () and end () states.
- Let be the transition probability from state to state , and let be the emission probability of symbol in state .
We can now define the following derivation rules:
- , where , </math>h, h' \in H</math>
- , for </math>h \in H</math>