Class Meeting for 10-710 09-01-2011

This is one of the class meetings on the schedule for the course Structured Prediction 10-710 in Fall 2011.

LSP: chapter 1 (through 1.3) and section 3.3 (through 3.3.3). For background on dynamic programming (the buildup to the Viterbi algorithm), see section 2.3 (through 2.3.1, or beyond if you're interested).
A Maximum Entropy Part-Of-Speech Tagger, Ratnaparkhi, Workshop on Very Large Corpora 1996
Mike Collins on learning in NLP, including a section on maxent taggers.
Dan Klein on maxent.
Bob Carpenter on maxent and SGD, with detailed derivations of multiclass logistic regression and its gradients.

Contents