KeisukeKamataki writeup of Collins 2002

This is a write-up of Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Collins, EMNLP 2002, by user:KeisukeKamataki


  • Summary:

The author extends the perceptron algorithm (in its standard, voted, and averaged variants) to parameter estimation for tagging. The training method itself is simple: each sentence is decoded with the Viterbi algorithm under a linear score, defined by a parameter vector alpha over feature counts of (history, tag) pairs; whenever the decoded tag sequence differs from the correct one, alpha is updated additively toward the features of the correct sequence and away from those of the predicted one. The averaged variant refines the model by returning the average of the parameter vectors seen over all training examples.
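
The training loop can be sketched roughly as follows (a minimal illustration, not the paper's implementation; `viterbi_decode` and `features` are assumed helpers, where `features(x, y)` returns the feature-count vector for sentence `x` with tag sequence `y` and `viterbi_decode(x, alpha)` returns the highest-scoring tag sequence under the current weights):

```python
from collections import defaultdict

def train_structured_perceptron(examples, viterbi_decode, features, epochs=5):
    """Standard perceptron for tagging, in the style of Collins (2002).

    examples       -- list of (sentence, gold_tag_sequence) pairs
    viterbi_decode -- assumed helper: argmax over tag sequences of alpha . Phi(x, y)
    features       -- assumed helper: dict of feature counts Phi(x, y)
    """
    alpha = defaultdict(float)  # one weight per feature
    for _ in range(epochs):
        for x, y_gold in examples:
            y_pred = viterbi_decode(x, alpha)
            if y_pred != y_gold:
                # Additive update: move toward the gold sequence's features
                # and away from the wrongly predicted sequence's features.
                for f, count in features(x, y_gold).items():
                    alpha[f] += count
                for f, count in features(x, y_pred).items():
                    alpha[f] -= count
    return alpha
```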

On the theory side, the key contribution is a set of margin-based theorems bounding the number of training errors. If the training data are separable with a sufficient margin, the method is guaranteed to converge to zero training errors; a second theorem shows it is also robust to data that are not linearly separable, bounding the errors in terms of how close the data are to being separable. Test error can likewise be bounded with a theorem of a similar flavor.
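
Concretely, the separable case (Theorem 1 in the paper) yields the classical perceptron mistake bound, stated here in LaTeX with Phi the feature map and GEN(x) the set of candidate tag sequences:

```latex
% Theorem 1 (Collins 2002): suppose there exist \alpha^* with
% \|\alpha^*\| = 1 and \delta > 0 such that for every training example
% (x_i, y_i) and every candidate z \in \mathrm{GEN}(x_i) with z \ne y_i,
%   \alpha^* \cdot \Phi(x_i, y_i) - \alpha^* \cdot \Phi(x_i, z) \ge \delta,
% and let R bound \|\Phi(x_i, y_i) - \Phi(x_i, z)\|. Then the number of
% mistakes the perceptron makes on the training sequence is at most
\[
  \frac{R^2}{\delta^2} .
\]
```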

In the experiments (part-of-speech tagging and base noun-phrase chunking), the algorithm outperformed a maximum-entropy (ME) baseline once the averaged perceptron was introduced; averaging both improved prediction accuracy and made training more stable than the standard perceptron.
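
The averaging trick itself is a small change to the sketch above (again an illustration under the same assumed helpers): keep a running sum of the weight vector after every example and return its mean.

```python
from collections import defaultdict

def train_averaged_perceptron(examples, viterbi_decode, features, epochs=5):
    """Averaged variant: return the mean of alpha over all examples seen,
    which damps oscillation in the weights and tends to generalize better."""
    alpha = defaultdict(float)
    total = defaultdict(float)  # running sum of alpha, one entry per feature
    n = 0
    for _ in range(epochs):
        for x, y_gold in examples:
            y_pred = viterbi_decode(x, alpha)
            if y_pred != y_gold:
                for f, count in features(x, y_gold).items():
                    alpha[f] += count
                for f, count in features(x, y_pred).items():
                    alpha[f] -= count
            # Accumulate the current weights for averaging.
            for f, value in alpha.items():
                total[f] += value
            n += 1
    return {f: value / n for f, value in total.items()}
```

(A real implementation would update `total` lazily rather than summing every feature on every example, but the arithmetic is the same.)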


  • I like:

The author gives a detailed theoretical justification of the algorithm. Although it is often difficult to tune the parameters of a discriminative model correctly for given data, it is valuable that the algorithm comes with a theoretical understanding of its error bounds.