Class Meeting for 10-707 9/27/2010
This is one of the class meetings on the schedule for the course Information Extraction 10-707 in Fall 2010.
Meta-Learning: Stacking and Sequential Models
The notes also have a short review of last week's session on CRFs.
- Slides
- Additional notes on Sha & Pereira, including a derivation of the gradient of the log-likelihood for CRFs.
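As a quick refresher on the result those notes derive, the gradient of the CRF log-likelihood takes the familiar form of empirical minus expected feature counts. The sketch below uses standard notation and is not transcribed from the linked notes:

```latex
% CRF log-likelihood gradient (standard result; notation assumed, not
% copied from the linked notes). Model:
%   p_\lambda(y \mid x) = \exp\big(\sum_j \lambda_j F_j(x, y)\big) / Z_\lambda(x)
\[
  L(\lambda) = \sum_i \Big[ \sum_j \lambda_j F_j\big(x^{(i)}, y^{(i)}\big)
    - \log Z_\lambda\big(x^{(i)}\big) \Big]
\]
\[
  \frac{\partial L}{\partial \lambda_j}
    = \sum_i \Big[ F_j\big(x^{(i)}, y^{(i)}\big)
      - \mathbb{E}_{y \sim p_\lambda(\cdot \mid x^{(i)})}
        \big[ F_j\big(x^{(i)}, y\big) \big] \Big]
\]
% For linear chains, the expectation is computed exactly with
% the forward-backward algorithm.
```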
Required Readings
- Stacked sequential learning, by William W. Cohen and Vitor Carvalho. In International Joint Conference on Artificial Intelligence, 2005. Some people find it awkward to review a paper written by the instructor, so if you prefer, just answer this question:
- If you used stacked sequential learning with naive Bayes as the base learner and K=10, how many times slower would it be than just running the base learner? (A sketch of the training procedure appears after this list.)
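To make the cost question concrete, here is a minimal sketch of stacked sequential learning, assuming scikit-learn's MultinomialNB as the base learner and a window of one predicted label on either side; the function names (`extend`, `stacked_fit`, `stacked_predict`) are illustrative, not from the paper:

```python
# A minimal sketch of stacked sequential learning (Cohen & Carvalho, 2005),
# assuming scikit-learn's MultinomialNB as the base learner. For brevity the
# training set is treated as one long sequence; the paper splits by whole
# sequences when cross-validating.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_predict

def extend(X, yhat):
    """Append the predicted labels at positions t-1, t, t+1 to example t."""
    prev = np.roll(yhat, 1)
    prev[0] = 0           # pad sequence boundaries with 0
    nxt = np.roll(yhat, -1)
    nxt[-1] = 0           # (labels assumed encoded as non-negative ints)
    return np.hstack([X, np.c_[prev, yhat, nxt]])

def stacked_fit(X, y, K=10):
    # Level 0: K-fold cross-validated predictions, so each yhat[i] comes
    # from a model that never saw example i -- this is the step that costs
    # roughly K extra runs of the base learner.
    yhat = cross_val_predict(MultinomialNB(), X, y, cv=K)
    f0 = MultinomialNB().fit(X, y)                # level-0 model on all data
    f1 = MultinomialNB().fit(extend(X, yhat), y)  # level-1 model, extended features
    return f0, f1

def stacked_predict(f0, f1, X):
    # Test time: predict with level 0, extend the features, predict with level 1.
    return f1.predict(extend(X, f0.predict(X)))
```

Counting the calls to fit in this sketch makes the relative training cost for a given K easy to read off.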
Optional Readings
- An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition, Krishnan and Manning, ACL 2006. Another take on stacked sequential learning.
- Stacked Graphical Models for Efficient Inference in Markov Random Fields, Zhenzhen Kou and William W. Cohen, SDM 2007. An extension of the stacked sequential learning method to arbitrary graphs.
- Transformation-Based Error-Driven Learning and Natural Language Processing, Brill, Computational Linguistics, 1995. The learning algorithm in the Brill tagger, which has also been used for NER (e.g., in ABGene).
- Search-based Structured Prediction, Daumé, Langford, and Marcu, Machine Learning, 2009. Another clever meta-learning algorithm that works well for sequences.
- Conditional graphical models, Perez-Cruz and Ghahramani, in Predicting Structured Data, MIT Press, Cambridge, MA, 2007, pp. 265-282. A very simple and effective meta-learning method.