Class Meeting for 10-707 9/20/2010
This is one of the class meetings on the schedule for the course Information Extraction 10-707 in Fall 2010.
Hidden Markov models and Maxent Markov Models
Required Readings
- Borkar 2001 Automatic Segmentation of Text Into Structured Records
- Freitag 2000 Maximum Entropy Markov Models for Information Extraction and Segmentation
Optional Readings
- An Algorithm that Learns What's in a Name, Bikel et al, MLJ 1999. Another well-engineered and influential HMM-based NER system.
- Unsupervised Learning of Field Segmentation Models for Information Extraction, Grenager, Klein, and Manning, ACL 2005. Unsupervised segmentation paper.
- Named Entity Recognition with Character-Level Models, Klein et al, CoNLL 2003. Interesting twist on the standard token-tagging approach: NER by tagging characters and character n-grams.
Background Readings
- A Maximum Entropy Part-Of-Speech Tagger, Ratnaparkhi, Workshop on Very Large Corpora 1996
- I'm going to actually present substantial parts of this in class 9/21, so I'm taking it off the "optional" list - Wcohen 21:36, 19 September 2009 (UTC)
- Mike Collins on learning in NLP, including a section on maxent taggers.
- Dan Klein on maxent.