Sutton McCallum ICML 2007: Piecewise pseudolikelihood for efficient CRF training
Citation
Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields. By Charles Sutton, Andrew McCallum. In ICML, 2007.
Online version
http://www.machinelearning.org/proceedings/icml2007/papers/549.pdf
Summary
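Discriminative training of graphical models is expensive when the variables have large cardinality. Pseudolikelihood reduces the training cost, since its running time is linear in the variable cardinality, but it often gives up accuracy. Piecewise training is more accurate, but it scales poorly with the cardinality. The authors therefore maximize the pseudolikelihood of the piecewise model, which they call piecewise pseudolikelihood (PWPL). If $m$ represents the maximum number of assignments to a single variable and $K$ represents the size of the largest factor, then normalizing a factor in piecewise training sums over up to $m^K$ joint assignments, whereas piecewise pseudolikelihood normalizes one variable at a time and sums over at most $mK$ terms per factor.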
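The difference in cost can be made concrete by counting the terms in the local normalizers of a single factor. The sketch below is only an illustration: the function names are hypothetical, `m` stands for the number of values a variable can take, `K` for the number of variables in the factor, and the values 45 and 3 are arbitrary.

```python
def piecewise_terms(m: int, K: int) -> int:
    """Terms summed to normalize one factor jointly over all K variables."""
    return m ** K


def pwpl_terms(m: int, K: int) -> int:
    """Terms summed when each of the K variables is normalized on its own,
    with the other K - 1 variables held at their observed values."""
    return m * K


if __name__ == "__main__":
    m, K = 45, 3  # arbitrary illustrative values
    print(piecewise_terms(m, K))  # 91125
    print(pwpl_terms(m, K))       # 135
```

The gap grows quickly with the factor size, which is why the per-variable normalization matters most for models with large label sets.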
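Definition of Piecewise Pseudolikelihood
For a single instance $(x, y)$, the piecewise pseudolikelihood sums a local conditional log-likelihood over every factor $a$ and every variable $s$ in that factor:
$$\ell_{PWPL}(\theta; x, y) = \sum_{a} \sum_{s \in a} \log p_{LCL}(y_s \mid y_{a \setminus s}, x, \theta_a)$$
where the local conditional is normalized over the values of $y_s$ only, with the other variables of the factor, $y_{a \setminus s}$, held at their observed values:
$$p_{LCL}(y_s \mid y_{a \setminus s}, x, \theta_a) = \frac{\psi_a(y_s, y_{a \setminus s}, x, \theta_a)}{\sum_{y'_s} \psi_a(y'_s, y_{a \setminus s}, x, \theta_a)}$$
Here $\psi_a$ is the potential of factor $a$ and $\theta_a$ are its parameters.
Therefore the optimization function over a training set $D$ is
$$O_{PWPL}(\theta; D) = \sum_{(x_i, y_i) \in D} \ell_{PWPL}(\theta; x_i, y_i) - \sum_{a} \frac{\lVert \theta_a \rVert^2}{2\sigma^2}$$
where the second term is the standard Gaussian prior (with variance $\sigma^2$) to prevent overfitting.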
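As a minimal sketch of this objective, the code below evaluates the piecewise pseudolikelihood for factors whose log-potentials are stored in a table. Several simplifications are assumptions made for illustration and are not taken from the paper: each factor is a dict mapping a tuple of labels to a weight (missing entries count as 0), the dependence on the input $x$ is dropped, every variable takes values in `range(m)`, and all function names are hypothetical.

```python
import math


def log_sum_exp(scores):
    """Numerically stable log(sum(exp(s) for s in scores))."""
    mx = max(scores)
    return mx + math.log(sum(math.exp(s - mx) for s in scores))


def pwpl_factor(theta_a, y_a, m):
    """Sum over the variables s of one factor of log p(y_s | rest of factor).

    theta_a: dict from label tuples to log-potentials (assumed table form).
    y_a:     tuple of observed labels for the variables in this factor.
    m:       number of values each variable can take.
    """
    total = 0.0
    for s in range(len(y_a)):
        # Vary only variable s; the other variables stay at observed values.
        scores = [theta_a.get(y_a[:s] + (v,) + y_a[s + 1:], 0.0)
                  for v in range(m)]
        total += theta_a.get(y_a, 0.0) - log_sum_exp(scores)
    return total


def pwpl_objective(factors, m, sigma2):
    """Piecewise pseudolikelihood minus a Gaussian-prior penalty.

    factors: list of (theta_a, y_a) pairs, one per factor instance.
    sigma2:  prior variance.
    """
    loglik = sum(pwpl_factor(theta_a, y_a, m) for theta_a, y_a in factors)
    penalty = sum(w * w for theta_a, _ in factors
                  for w in theta_a.values()) / (2.0 * sigma2)
    return loglik - penalty
```

Each local normalizer here touches only `m` table entries, so one factor over `K` variables costs about `m * K` lookups. In practice the objective would be maximized with a gradient-based optimizer, which this sketch does not include.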
Experimental results
- Synthetic sequences generated by a second-order HMM.
- Part-of-speech (POS) tagging on the Penn Treebank.
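For readers unfamiliar with the synthetic setting, the following sketch shows one way to sample sequences from a second-order HMM, in which each hidden state depends on the two preceding states. Everything here is an assumption made for illustration: the randomly drawn transition and emission tables, the uniform choice of the first two states, and the function name are hypothetical and do not reproduce the paper's actual generator or parameters.

```python
import random


def sample_second_order_hmm(length, n_states, n_symbols, seed=0):
    """Sample (states, observations) from a randomly parameterized
    second-order HMM. All parameters are drawn at random (illustrative)."""
    rng = random.Random(seed)

    def random_dist(n):
        # A random probability vector of length n.
        weights = [rng.random() for _ in range(n)]
        z = sum(weights)
        return [w / z for w in weights]

    # trans[(i, j)] is the distribution of the next state given the
    # previous two states i (older) and j (more recent).
    trans = {(i, j): random_dist(n_states)
             for i in range(n_states) for j in range(n_states)}
    # emit[s] is the distribution over symbols emitted from state s.
    emit = [random_dist(n_symbols) for _ in range(n_states)]

    # The first two states are drawn uniformly (an assumption).
    states = [rng.randrange(n_states), rng.randrange(n_states)]
    while len(states) < length:
        dist = trans[(states[-2], states[-1])]
        states.append(rng.choices(range(n_states), weights=dist)[0])
    states = states[:length]

    observations = [rng.choices(range(n_symbols), weights=emit[s])[0]
                    for s in states]
    return states, observations
```

Calling `sample_second_order_hmm(20, 3, 5, seed=1)` returns two lists of length 20; fixing `seed` makes the sample reproducible.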
Related Papers
Pseudolikelihood was proposed by Besag (1975) and has been applied in NLP by Toutanova et al. (2003) and others.