Selen writeup of McCallum et al.

This is a review of Frietag_2000_Maximum_Entropy_Markov_Models_for_Information_Extraction_and_Segmentation by user:Selen.

In this paper, the authors recover the underlying state sequence from a sequence of observations using maximum entropy Markov models (MEMMs). They test the method on a collection of FAQ documents: for each line of a document, they predict which section it belongs to (head, question, answer, or tail). For this task they use 24 Boolean features associated with each line, and they compare their performance to HMMs.
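To make the model concrete, below is a minimal sketch (my own toy code, not from the paper) of MEMM decoding: one maximum entropy distribution P(s | s', o) per previous state s', combined with Viterbi search over the lines of a document. The feature dimensionality, weight layout, and start-state name are illustrative assumptions; the paper trains the per-state maxent weights with GIS, which is omitted here.

```python
import numpy as np

# States mirror the FAQ segmentation task: each line of a document is
# tagged head, question, answer, or tail.
STATES = ["head", "question", "answer", "tail"]

def transition_probs(prev_state, features, weights):
    """P(s | s', o): one maxent (softmax) distribution per previous state s'.

    `weights[prev_state]` maps each next state to a weight vector over the
    binary line features (24 in the paper; fewer in this toy sketch).
    """
    scores = np.array([weights[prev_state][s] @ features for s in STATES])
    exp = np.exp(scores - scores.max())        # numerically stable softmax
    return exp / exp.sum()

def viterbi(observations, weights, start="<start>"):
    """Recover the most likely state sequence for a document's lines."""
    n, k = len(observations), len(STATES)
    delta = np.zeros((n, k))                   # best log-prob ending in state s
    back = np.zeros((n, k), dtype=int)         # backpointers
    delta[0] = np.log(transition_probs(start, observations[0], weights))
    for t in range(1, n):
        for j in range(k):
            cand = [delta[t - 1, i] +
                    np.log(transition_probs(STATES[i], observations[t], weights)[j])
                    for i in range(k)]
            back[t, j] = int(np.argmax(cand))
            delta[t, j] = max(cand)
    # Trace the best path backwards from the last line.
    path = [int(np.argmax(delta[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [STATES[i] for i in reversed(path)]

if __name__ == "__main__":
    # Toy usage: 3 binary features per line, random illustrative weights.
    rng = np.random.default_rng(0)
    weights = {p: {s: rng.normal(size=3) for s in STATES}
               for p in STATES + ["<start>"]}
    lines = [np.array([1., 0., 0.]), np.array([0., 1., 0.]), np.array([0., 0., 1.])]
    print(viterbi(lines, weights))
```

The point of the per-previous-state distributions is that the observation features condition the transitions directly, which is exactly what a generative HMM cannot do without modeling P(o | s).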

What I like about this paper is that it gives an extensive comparison with alternative methods: a stateless maximum entropy classifier, a token HMM, a feature HMM, and the MEMM.

What I don't like about this paper is that the authors say the 24 Boolean features are "non-independent," yet they do not describe the dependencies between the features. They also claim that an HMM is not appropriate for a task like this, but the fact that one model is generative and the other discriminative does not by itself make it inappropriate to use them interchangeably on the same task. Leave-n-minus-1-out evaluation (training on a single labeled document and testing on the remaining n-1) is harder on the HMM than on the MEMM, so I would like to see results for different cross-validation splits.

  • I agree - it would be interesting to see some sort of experiment where they explicitly test the "correlated features are bad for HMMs" conjecture. - Wcohen 14:33, 24 September 2009 (UTC)