Sutton McCullum ICML 2007: Piecewise pseudolikelihood for efficient CRF training

From Cohen Courses

Revision as of 01:40, 1 October 2011

Citation

Piecewise Pseudolikelihood for Efficient Training of Conditional Random Fields. By Charles Sutton, Andrew McCallum. In ICML, 2007.

Online version

http://www.machinelearning.org/proceedings/icml2007/papers/549.pdf

Summary

Discriminative training of graphical models is expensive when the variables have large cardinality. Pseudolikelihood generally reduces the cost of inference, but compromises on accuracy. Piecewise training is accurate, but it is expensive in a similar way, because each factor is normalized over every joint assignment of its variables. The authors therefore maximize the pseudolikelihood of the piecewise model. If M represents the maximum number of assignments to a single variable and K represents the size of the largest factor, normalizing one factor costs O(M^K) under piecewise training but only O(KM) under piecewise pseudolikelihood.
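To make the cost argument concrete, the sketch below counts how many terms must be summed to normalize a single factor under each scheme. The names M (values per variable) and K (variables per factor), the function names, and the example sizes are assumptions made for this illustration; the counts ignore constant factors and the cost of evaluating each potential.

```python
def piecewise_terms(num_values: int, factor_size: int) -> int:
    """Terms summed when one factor is normalized jointly over all of its variables."""
    return num_values ** factor_size


def pwpl_terms(num_values: int, factor_size: int) -> int:
    """Terms summed when each variable of the factor is normalized on its own,
    with the factor's other variables held at their observed values."""
    return factor_size * num_values


if __name__ == "__main__":
    # Hypothetical label-set sizes for a pairwise factor (K = 2).
    for m in (2, 45, 1000):
        print(m, piecewise_terms(m, 2), pwpl_terms(m, 2))
```

The gap widens quickly as either the label set or the factor size grows, which is the regime the paper targets.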

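Definition of Piecewise Pseudolikelihood

For a single instance $(\mathbf{x}, \mathbf{y})$, the piecewise pseudolikelihood is

$$\ell_{\mathrm{PWPL}}(\theta; \mathbf{x}, \mathbf{y}) = \sum_{a} \sum_{s \in a} \log p_{\mathrm{LCL}}(y_s \mid \mathbf{y}_{a \setminus s}, \mathbf{x}, \theta_a)$$

where $a$ ranges over the factors of the model, $s$ ranges over the variables in factor $a$, and the local conditional likelihood is

$$p_{\mathrm{LCL}}(y_s \mid \mathbf{y}_{a \setminus s}, \mathbf{x}, \theta_a) = \frac{\psi_a(y_s, \mathbf{y}_{a \setminus s}, \mathbf{x}; \theta_a)}{\sum_{y'_s} \psi_a(y'_s, \mathbf{y}_{a \setminus s}, \mathbf{x}; \theta_a)}$$

Therefore the optimization function over the training instances $i$ is

$$O(\theta) = \sum_{i} \ell_{\mathrm{PWPL}}(\theta; \mathbf{x}^{(i)}, \mathbf{y}^{(i)}) - \sum_{k} \frac{\theta_k^2}{2\sigma^2}$$

where the second term is the standard Gaussian prior to prevent overfitting.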
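As an illustration, the piecewise pseudolikelihood of one labeled instance can be computed by visiting each factor and, for each variable in that factor, renormalizing the factor's potential over that variable's values alone while the factor's other variables stay at their observed labels. The following is a minimal sketch, not the authors' implementation: the factor representation (a tuple of variable indices plus a log-potential function that has already absorbed the input x), the function names, and the toy weights are all assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A factor is (indices of its variables, log-potential over an assignment to them).
Factor = Tuple[Tuple[int, ...], Callable[[Tuple[int, ...]], float]]


def local_conditional_logprob(log_potential, assignment, position, num_values):
    """Log-probability of one variable's observed value given the other variables
    in the same factor, using that factor's potential only."""
    scores = []
    for value in range(num_values):
        alternative = assignment[:position] + (value,) + assignment[position + 1:]
        scores.append(log_potential(alternative))
    top = max(scores)  # log-sum-exp with max subtraction for numerical stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_potential(assignment) - log_norm


def pwpl(factors: Sequence[Factor], labels: Sequence[int], num_values: int) -> float:
    """Sum of local conditional log-probabilities over all factors and their variables."""
    total = 0.0
    for variables, log_potential in factors:
        assignment = tuple(labels[i] for i in variables)
        for position in range(len(variables)):
            total += local_conditional_logprob(
                log_potential, assignment, position, num_values)
    return total


def regularized_objective(instance_values: List[float], theta: List[float],
                          sigma_sq: float) -> float:
    """Per-instance PWPL values summed, minus a Gaussian-prior penalty on the weights."""
    return sum(instance_values) - sum(t * t for t in theta) / (2.0 * sigma_sq)


if __name__ == "__main__":
    # Toy chain with three binary labels and hypothetical transition weights.
    weights = [[1.0, 0.0], [0.0, 1.0]]

    def edge(assignment):
        return weights[assignment[0]][assignment[1]]

    factors = [((0, 1), edge), ((1, 2), edge)]
    value = pwpl(factors, [0, 0, 1], num_values=2)
    flat_weights = [w for row in weights for w in row]
    print(value, regularized_objective([value], flat_weights, sigma_sq=10.0))
```

Each inner normalization loops over the values of a single variable, so no sum over joint assignments of a factor is ever formed.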

Experimental results

  • Sequences generated by a second-order HMM.
  • POS tagging on the Penn Treebank data set.
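For readers unfamiliar with the first setting, a second-order HMM draws each hidden state conditioned on the two previous states and then emits a symbol from the current state. The sketch below shows one way such sequences could be sampled. It is an assumption-laden illustration: the table layout, the fixed start state, the function name, and all probabilities are hypothetical and are not the parameters used in the paper.

```python
import random


def sample_second_order_hmm(length, transition, emission, rng, start_state=0):
    """Sample (states, observations) of the given length.

    transition[(prev2, prev1)] is a list of probabilities over the next state;
    emission[state] is a list of probabilities over output symbols.
    """
    states, observations = [], []
    prev2, prev1 = start_state, start_state  # assumed padding for the first two steps
    for _ in range(length):
        next_probs = transition[(prev2, prev1)]
        state = rng.choices(range(len(next_probs)), weights=next_probs)[0]
        symbol_probs = emission[state]
        symbol = rng.choices(range(len(symbol_probs)), weights=symbol_probs)[0]
        states.append(state)
        observations.append(symbol)
        prev2, prev1 = prev1, state
    return states, observations


if __name__ == "__main__":
    # Hypothetical two-state, three-symbol model.
    transition = {
        (0, 0): [0.7, 0.3], (0, 1): [0.4, 0.6],
        (1, 0): [0.5, 0.5], (1, 1): [0.2, 0.8],
    }
    emission = {0: [0.6, 0.3, 0.1], 1: [0.1, 0.3, 0.6]}
    print(sample_second_order_hmm(10, transition, emission, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the sampled data reproducible across runs.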

Related Papers

Pseudolikelihood was proposed by Besag (1975) and has been applied in NLP by Toutanova et al. (2003) and others.