Sutton McCallum ICML 2007: Piecewise pseudolikelihood for efficient CRF training

Citation

Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields. By Charles Sutton, Andrew McCallum. In ICML, 2007.

Online version

This paper is available here.

Summary

Discriminative training of graphical models is expensive when the variables have large cardinality. Pseudolikelihood reduces the cost of inference, but generally at the expense of accuracy. Piecewise training is accurate, but it is expensive in the same way. The authors therefore maximize the pseudolikelihood of the piecewise model, which gives the piecewise pseudolikelihood (PWPL) objective.

Definition of Piecewise Pseudolikelihood

For a single instance $({\vec {x}},{\vec {y}})$, with $a$ ranging over the factors of the model and $s$ over the variables in factor $a$,

${\mathcal {L}}_{\mbox{PWPL}}(\Lambda ,{\vec {x}},{\vec {y}})=\sum _{a}\sum _{s\in a}\ln p_{\mbox{LCL}}(y_{s}|{\vec {y}}_{a-s},{\vec {x}},\lambda _{a})$

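where the local conditional likelihood of variable $s$ within factor $a$ is

$p_{\mbox{LCL}}(y_{s}|{\vec {y}}_{a-s},{\vec {x}},\lambda _{a})={\frac {\Psi _{a}(y_{s},{\vec {y}}_{a-s},{\vec {x}},\lambda _{a})}{Z({\vec {y}}_{a-s},{\vec {x}},\lambda _{a})}}$

and the local normalizer sums over the values of $y_{s}$ only, with the other variables of the factor held at their observed values:

$Z({\vec {y}}_{a-s},{\vec {x}},\lambda _{a})=\sum _{y'_{s}}\Psi _{a}(y'_{s},{\vec {y}}_{a-s},{\vec {x}},\lambda _{a})$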
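To make the double sum concrete, the following is a minimal sketch of the per-instance PWPL for a linear chain whose factors are the adjacent label pairs $(y_{t},y_{t+1})$. The function name `pwpl_chain` and the array layout of `log_psi` are assumptions made for illustration, not taken from the paper; the log-potentials are assumed to have been evaluated on the observed ${\vec {x}}$ already.

```python
import numpy as np

def pwpl_chain(log_psi, y):
    """Piecewise pseudolikelihood of one labelled chain (illustrative sketch).

    log_psi: array of shape (T-1, K, K); log_psi[t, i, j] is the log-potential
             of the pairwise factor over (y_t, y_{t+1}) with y_t = i and
             y_{t+1} = j (hypothetical layout, assumed precomputed from x).
    y:       length-T sequence of integer labels in [0, K).
    """
    log_psi = np.asarray(log_psi, dtype=float)
    total = 0.0
    for t in range(len(y) - 1):
        i, j = y[t], y[t + 1]
        score = log_psi[t, i, j]
        # Term for s = y_t: normalise over y_t with y_{t+1} held at j.
        total += score - np.logaddexp.reduce(log_psi[t, :, j])
        # Term for s = y_{t+1}: normalise over y_{t+1} with y_t held at i.
        total += score - np.logaddexp.reduce(log_psi[t, i, :])
    return float(total)
```

Each local normalizer here sums over the K values of a single variable, so one pairwise factor costs on the order of K operations per term, whereas normalizing over both variables of the factor would need K² terms.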

The training objective, summed over the training instances $i$, is therefore

$O=\sum _{i}{\mathcal {L}}_{\mbox{PWPL}}(\Lambda ,{\vec {x}}^{(i)},{\vec {y}}^{(i)})-\sum _{a}{\frac {\lambda _{a}^{2}}{2\sigma ^{2}}}$

where the second term is the standard Gaussian prior on the parameters, which prevents overfitting.
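As a small illustration of how the two terms combine, the sketch below adds up per-instance PWPL values and subtracts the Gaussian penalty. The function name `regularized_objective` and its arguments are hypothetical; it assumes the per-instance values have been computed elsewhere and that all parameters are stored in one flat array sharing a single variance $\sigma ^{2}$.

```python
import numpy as np

def regularized_objective(pwpl_per_instance, params, sigma=1.0):
    """Sum of per-instance PWPL values minus a Gaussian-prior penalty.

    pwpl_per_instance: iterable of PWPL values, one per training instance.
    params:            array of all parameters (assumed flattened).
    sigma:             standard deviation of the Gaussian prior.
    """
    data_term = float(np.sum(pwpl_per_instance))
    # Penalty: sum of squared parameters divided by 2 * sigma^2.
    penalty = float(np.sum(np.square(params)) / (2.0 * sigma ** 2))
    return data_term - penalty
```

A smaller `sigma` strengthens the penalty and pulls the parameters toward zero; a larger one lets the data term dominate.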

Experimental results

Synthetic sequences generated by a second-order HMM.

POS tagging on the Penn Treebank.
