Philgoo Han writeup of Collins
This is a review of Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Collins, EMNLP 2002 by user:Ironfoot.
- Training HMMs with the perceptron algorithm
- Probability structure as HMM
- Estimate function as maxent
- Update as perceptron
- Is there any reason only two states of history are used? A window size of 20 showed much lower error in Cohen (2005)
- Using the average (or weighted average) of the parameters performs better than using only the parameters from the last step. This is surprising and not intuitive
- Comparing results with other models (all the HMMs below, CRF, voted perceptron, etc.) would be interesting
- We have seen four variations of HMM so far
- HMM trained with joint likelihood
- HMM trained with conditional likelihood
- Maxent HMM - modification in structure
- HMM trained with the perceptron algorithm
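The perceptron update and the parameter averaging discussed above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes only tag-bigram and word/tag features (the paper uses a history of two previous tags and richer features), non-empty sentences, and a naive running sum for the average. The names `features`, `viterbi`, and `train` are hypothetical.

```python
from collections import defaultdict


def features(words, tags):
    """Count global features of a tagged sentence (assumed feature set:
    tag bigrams and word/tag pairs only)."""
    phi = defaultdict(int)
    prev = "<s>"  # assumed start-of-sentence symbol
    for word, tag in zip(words, tags):
        phi[("trans", prev, tag)] += 1
        phi[("emit", word, tag)] += 1
        prev = tag
    return phi


def viterbi(words, tagset, w):
    """Return the highest-scoring tag sequence under weights w.
    Assumes words is non-empty; ties go to the earlier tag in tagset."""
    best = {
        t: (w.get(("trans", "<s>", t), 0.0) + w.get(("emit", words[0], t), 0.0), [t])
        for t in tagset
    }
    for word in words[1:]:
        new_best = {}
        for t in tagset:
            # Best previous tag p for a path that ends in tag t at this word.
            score, path = max(
                ((best[p][0] + w.get(("trans", p, t), 0.0), best[p][1]) for p in tagset),
                key=lambda x: x[0],
            )
            new_best[t] = (score + w.get(("emit", word, t), 0.0), path + [t])
        best = new_best
    return max(best.values(), key=lambda x: x[0])[1]


def train(data, tagset, epochs=5):
    """Structured perceptron training; returns (last weights, averaged weights)."""
    w = defaultdict(float)
    total = defaultdict(float)  # sum of the weight vector after every example
    n = 0
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tagset, w)
            if pred != gold:
                # Additive update: reward gold features, penalize predicted ones.
                for f, c in features(words, gold).items():
                    w[f] += c
                for f, c in features(words, pred).items():
                    w[f] -= c
            # Naive averaging: O(number of features) per example.
            for f, v in w.items():
                total[f] += v
            n += 1
    avg = {f: v / n for f, v in total.items()}
    return w, avg


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [
        (["the", "dog", "barks"], ["D", "N", "V"]),
        (["the", "cat", "sleeps"], ["D", "N", "V"]),
        (["dogs", "bark"], ["N", "V"]),
    ]
    last, avg = train(toy, ["D", "N", "V"])
    print(viterbi(["the", "dog", "barks"], ["D", "N", "V"], avg))
```

The averaged weights give more influence to parameter settings that survived many examples without a mistake, which is one common explanation for why averaging helps over the last-step parameters.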