Nschneid writeup of McCallum 2000

From Cohen Courses
Revision as of 10:42, 3 September 2010 by WikiAdmin (talk | contribs) (1 revision)

This is Nschneid's review of Frietag_2000_Maximum_Entropy_Markov_Models_for_Information_Extraction_and_Segmentation

The MEMM paper. MEMMs resemble linear-chain CRFs, but each next-state distribution is a locally normalized maxent model conditioned on the current observation and the previous state (whereas a CRF is globally normalized and can condition on the entire observation sequence). The paper does a nice job of arguing for the advantages of conditioning on observations and of allowing overlapping, non-independent features. It gives a clear presentation of the modified forward-backward recursion, training with GIS, and some variants (Baum-Welch-style EM for the semi-supervised case, and a reinforcement learning variant). It does not address the label bias problem.
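To make the model structure concrete, here is a minimal sketch of an MEMM as the paper describes it: one maxent classifier per previous state giving P(s | s', o), decoded with Viterbi. The feature function and the weights below are made-up toy values for illustration, not anything from the paper.

```python
import math

STATES = [0, 1]

def features(obs):
    # Overlapping, non-independent features of the observation --
    # exactly the flexibility the paper argues for.
    return [1.0, float(obs), float(obs > 0)]

# Toy weights W[s_prev][s][feature]; values are illustrative only.
W = {
    0: {0: [1.0, 0.5, 0.0], 1: [-1.0, 1.5, 1.0]},
    1: {0: [-0.5, -1.0, 0.0], 1: [1.0, 0.5, 0.5]},
}

def log_transition(s_prev, obs):
    """Log P(s | s_prev, obs): a locally normalized maxent model
    indexed by the previous state."""
    f = features(obs)
    scores = {s: sum(w * x for w, x in zip(W[s_prev][s], f)) for s in STATES}
    z = math.log(sum(math.exp(v) for v in scores.values()))
    return {s: v - z for s, v in scores.items()}

def viterbi(observations, start_state=0):
    """Most likely state sequence under the MEMM."""
    # delta[s] = best log-probability of any path ending in state s
    delta = log_transition(start_state, observations[0])
    back = []
    for obs in observations[1:]:
        trans = {sp: log_transition(sp, obs) for sp in STATES}
        new_delta, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda sp: delta[sp] + trans[sp][s])
            new_delta[s] = delta[best] + trans[best][s]
            ptr[s] = best
        back.append(ptr)
        delta = new_delta
    # Trace back the best path.
    path = [max(STATES, key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

viterbi([0, 1, 1, 0])
```

Because each transition distribution is normalized locally over the next states, states with few outgoing options pass probability mass along regardless of the observation, which is the label bias problem the review notes the paper leaves unaddressed.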

Experiments on a FAQ segmentation-classification task with a small corpus show that the MEMM fares better than several HMM variants. I would have liked to see more experiments, however, such as on POS tagging and NER.