Sutton McCallum ICML 2007: Piecewise pseudolikelihood for efficient CRF training
Citation
Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields. By Charles Sutton, Andrew McCallum. In ICML, 2007.
Online version
http://www.machinelearning.org/proceedings/icml2007/papers/549.pdf
Summary
Discriminative training of graphical models is expensive when the variables have large cardinality. Pseudolikelihood generally reduces the cost of inference but compromises accuracy; piecewise training, although accurate, is expensive in a similar way to standard training. The authors therefore maximize the pseudolikelihood of the piecewise model. If $m$ is the maximum number of assignments to a single variable and $a$ is the size of the largest factor, evaluating the objective for a factor becomes linear in $m$, rather than the $O(m^a)$ required by standard piecewise training.
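A back-of-the-envelope comparison makes the cost reduction concrete. The sizes below are illustrative choices, not numbers from the paper:

```python
# Number of summation terms needed to normalize one factor, assuming
# m labels per variable and a variables per factor (illustrative sizes).
m = 45  # e.g. roughly the size of the Penn Treebank POS tag set
a = 3   # a hypothetical three-variable factor

piecewise_terms = m ** a  # piecewise training: sum over all joint assignments
pwpl_terms = a * m        # PWPL: one m-way sum per local conditional

print(piecewise_terms, pwpl_terms)  # 91125 vs 135
```

Even for modest factor sizes, summing over joint assignments dwarfs the cost of the per-variable local normalizations.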
Definition of piecewise pseudolikelihood
For a single training instance $(\mathbf{x}, \mathbf{y})$, the piecewise pseudolikelihood is

$$\ell_{\mathrm{PWPL}}(\Lambda; \mathbf{x}, \mathbf{y}) = \sum_{a} \sum_{s \in a} \log p_{\mathrm{PW}}(y_s \mid \mathbf{y}_{a \setminus s}, \mathbf{x}),$$

where each local conditional is normalized within its own factor:

$$p_{\mathrm{PW}}(y_s \mid \mathbf{y}_{a \setminus s}, \mathbf{x}) = \frac{\Psi_a(y_s, \mathbf{y}_{a \setminus s}, \mathbf{x})}{\sum_{y'_s} \Psi_a(y'_s, \mathbf{y}_{a \setminus s}, \mathbf{x})}.$$

Therefore the optimization function over the training set is

$$\mathcal{O}(\Lambda) = \sum_{i} \ell_{\mathrm{PWPL}}(\Lambda; \mathbf{x}^{(i)}, \mathbf{y}^{(i)}) - \sum_{k} \frac{\lambda_k^2}{2\sigma^2},$$

where the second term is the standard Gaussian prior used to prevent overfitting.
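A minimal sketch of this objective for toy pairwise factors over discrete labels. The function names, the table-based parameterization, and the prior variance are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def pwpl_factor(log_psi, ys):
    """PWPL contribution of one pairwise factor.

    log_psi: (m, m) table of log-potentials for the factor.
    ys: (i, j) observed labels of the factor's two variables.
    """
    i, j = ys
    # log p(y_1 = i | y_2 = j): normalize column j over the first axis
    lp1 = log_psi[i, j] - np.logaddexp.reduce(log_psi[:, j])
    # log p(y_2 = j | y_1 = i): normalize row i over the second axis
    lp2 = log_psi[i, j] - np.logaddexp.reduce(log_psi[i, :])
    return lp1 + lp2

def pwpl_objective(log_psis, assignments, sigma2=10.0):
    """Sum of per-factor PWPL terms minus a Gaussian (L2) prior."""
    ll = sum(pwpl_factor(t, ys) for t, ys in zip(log_psis, assignments))
    prior = sum((t ** 2).sum() for t in log_psis) / (2 * sigma2)
    return ll - prior
```

Each conditional only ever sums over the labels of one variable within one factor, which is the source of the efficiency gain.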
Experimental results
- Synthetic sequences generated by a second-order HMM.
- POS tagging on the Penn Treebank.
Related Papers
Pseudolikelihood was proposed by Besag (1975) and has been applied in NLP by Toutanova et al. (2003) and others.