Berg-Kirkpatrick et al, ACL 2010: Painless Unsupervised Learning with Features

Revision as of 16:50, 17 September 2011 by Yunwang (talk | contribs)

Citation

T. Berg-Kirkpatrick, A. Bouchard-Côté, J. DeNero, and D. Klein. Painless Unsupervised Learning with Features. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL (NAACL HLT 2010), pp. 582-590, Los Angeles, June 2010.

Online Version

PDF version

Summary

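This paper generalizes conventional HMMs to featurized HMMs by replacing each multinomial conditional probability distribution (CPD) with a miniature log-linear model over features of the outcome and its conditioning context. Two algorithms for unsupervised training of featurized HMMs are proposed: a variant of EM whose M-step fits the log-linear weights by gradient-based optimization, and direct gradient-based optimization of the marginal likelihood.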
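To make the idea of a log-linear CPD concrete, here is a minimal sketch of a featurized emission distribution for a tagging HMM. It is an illustration under stated assumptions, not the authors' implementation: the function names (`log_linear_cpd`, `emission_features`), the feature templates, the toy vocabulary, and the weights are all hypothetical. Each probability is proportional to the exponentiated sum of the weights of the features that fire, normalized over all outcomes for the same context.

```python
import math


def emission_features(word, tag):
    """Hypothetical binary feature templates for emitting `word` from `tag`."""
    feats = [("identity", tag, word)]            # one feature per (tag, word) pair
    if word[0].isupper():
        feats.append(("capitalized", tag))
    if any(ch.isdigit() for ch in word):
        feats.append(("has_digit", tag))
    feats.append(("suffix2", tag, word[-2:]))    # last two characters
    return feats


def log_linear_cpd(weights, feature_fn, context, outcomes):
    """Return {outcome: P(outcome | context)} with P proportional to exp(w . f)."""
    # Score of an outcome = sum of the weights of its active (binary) features.
    scores = {
        y: sum(weights.get(f, 0.0) for f in feature_fn(y, context))
        for y in outcomes
    }
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores.values())
    z = sum(math.exp(s - top) for s in scores.values())
    return {y: math.exp(s - top) / z for y, s in scores.items()}


# Toy usage (assumed vocabulary and weights, for illustration only).
vocab = ["walking", "London", "42", "runs"]
weights = {("suffix2", "VERB", "ng"): 2.0, ("capitalized", "NOUN"): 1.5}

p_verb = log_linear_cpd(weights, emission_features, "VERB", vocab)
p_noun = log_linear_cpd(weights, emission_features, "NOUN", vocab)
p_flat = log_linear_cpd({}, emission_features, "VERB", vocab)
```

With all weights at zero the distribution is uniform, and with only the identity features it can represent any multinomial with nonzero probabilities, so the conventional HMM is a special case. The shared features (suffix, capitalization, digits) are what let evidence about one word inform the probabilities of similar words.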

Featurized HMMs are applied to four unsupervised learning tasks:

  • POS induction (unsupervised version of POS tagging);
  • Grammar induction;
  • Word alignment;
  • Word segmentation.

On all four tasks, the featurized models are shown to outperform their unfeaturized counterparts by a substantial margin.

Featurized HMMs

Definition

The Estimation Problem

The Decoding Problem

The Training Problem

Experiments

POS Induction

Grammar Induction

Word Alignment

Word Segmentation