Difference between revisions of "Class Meeting for 10-710 09-01-2011"

Latest revision as of 00:17, 28 September 2011

This is one of the class meetings on the schedule for the course Structured Prediction 10-710 in Fall 2011.

Optional Readings

An Algorithm that Learns What's in a Name, Bikel et al., MLJ 1999. (Wiki) Another well-engineered and influential HMM-based NER system.
Unsupervised Learning of Field Segmentation Models for Information Extraction, Grenager, Klein, and Manning, ACL 2005. Unsupervised segmentation paper.
Named Entity Recognition with Character-Level Models, Klein et al., CoNLL 2003. An interesting twist on the standard token-tagging approach: NER by tagging characters and character n-grams.

Background Readings

LSP: chapter 1 (through 1.3) and section 3.3 (through 3.3.3); for background on dynamic programming (a buildup to the Viterbi algorithm), see section 2.3 (through 2.3.1, or beyond if you're interested).
A Maximum Entropy Part-Of-Speech Tagger, Ratnaparkhi, Workshop on Very Large Corpora 1996
Mike Collins on learning in NLP, including a section on maxent taggers.
Dan Klein on maxent.
Bob Carpenter on maxent and SGD, with detailed derivations of multiclass logistic regression and its gradients.

@@ Line 1: / Line 1: @@
 This is one of the class meetings on the [[Syllabus for Structured Prediction 10-710 in Fall 2011|schedule]] for the course [[Structured Prediction 10-710 in Fall 2011]].
-=== Hidden Markov models and Maxent Markov Models ===
+=== Hidden Markov Models as Structured Prediction ===
-* [http://www.cs.cmu.edu/~wcohen/10-707/09-20-mene+hmms.ppt Slides]
+* [http://www.cs.cmu.edu/~wcohen/10-710/09-01-hmms.ppt Slides in Powerpoint]
+* [http://www.cs.cmu.edu/~wcohen/10-710/09-01-hmms.pdf Slides in PDF]
 === Required Readings ===
@@ Line 12: / Line 13: @@
 === Optional Readings ===
-* [http://www.cs.cmu.edu/~wcohen/10-707/papers/bikel.pdf An Algorithm that Learns What's in a Name, Bikel ''et al'', MLJ 1999].  Another well-engineered and influential HMM-based NER system.
+* [http://www.cs.cmu.edu/~wcohen/10-707/papers/bikel.pdf An Algorithm that Learns What's in a Name, Bikel ''et al'', MLJ 1999]. ([[Paper::Bikel et al MLJ 1999|Wiki]])  Another well-engineered and influential HMM-based NER system.
 * [http://www.stanford.edu/~grenager/papers/unsupie_final.ps Unsupervised Learning of Field Segmentation Models for Information Extraction, Grenager, Klein, and Manning, ACL 2005].  Unsupervised segmentation paper.
 * [http://www-nlp.stanford.edu/~manning/papers/conll-ner.pdf Named Entity Recognition with Character-Level Models, Klein ''et al'', CoNLL 2003].  Interesting twist on the standard approach of token-tagging - NER by tagging characters and character n-grams.
@@ Line 18: / Line 19: @@
 === Background Readings ===
+* [http://www.morganclaypool.com/doi/abs/10.2200/S00361ED1V01Y201105HLT013 <i>LSP</i>]:  chapter 1 (through 1.3) and section 3.3 (through 3.3.3); for some background on dynamic programming (buildup to Viterbi algorithm), see 2.3 (through 2.3.1, beyond if you're interested)
 * [http://acl.ldc.upenn.edu/W/W96/W96-0213.pdf A Maximum Entropy Part-Of-Speech Tagger, Ratnaparkhi, Workshop on Very Large Corpora 1996]
-** I'm going to actually present substantial parts of this in class 9/21, so I'm taking it off the "optional" list - [[User:Wcohen|Wcohen]] 21:36, 19 September 2009 (UTC)
 * [http://www.ai.mit.edu/people/mcollins/papers/tutorial_colt.pdf Mike Collins on learning in NLP], including a section on maxent taggers.
 * [http://www-diglib.stanford.edu/~klein/maxent-tutorial-slides-6.pdf Dan Klein on maxent].
+* [http://lingpipe.files.wordpress.com/2008/04/lazysgdregression.pdf Bob Carpenter on maxent and SGD] has detailed derivations of multiclass logistic regression and its gradients.
