Sgardine writesup Frietag et al 2000

From Cohen Courses

This is a review of Frietag_2000_Maximum_Entropy_Markov_Models_for_Information_Extraction_and_Segmentation by user:sgardine.

Synopsis

HMMs exhibit two flaws in sequential NL tasks: they do not naturally represent phenomena with overlapping, non-independent features, and they expend representational and computational power on the probabilities of a joint model that includes the known observations. MEMMs address both issues by specifying a conditional distribution in terms of an exponential model over observation features. Training of MEMMs is discussed using GIS. MEMM is evaluated on the task of segmenting FAQs against a stateless MaxEnt model, a standard token HMM, and an HMM using features (FeatureHMM). MEMM outperforms all other models, with FeatureHMM second, suggesting that the use of features contributes heavily to the success.
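The core idea can be sketched in a few lines: each source state owns a maxent model P(s'|s,o) ∝ exp(Σ_a λ_a f_a(o,s')), trained with GIS updates of the form λ_a += (1/C) log(E_data[f_a]/E_model[f_a]). The following is a minimal illustration, not the paper's code; the function names, the feature-function signature, and the toy setup are all my own assumptions.

```python
import math
from collections import defaultdict

def transition_probs(weights, feats, states, obs):
    """P(s' | s, o) ∝ exp(Σ_a λ_a f_a(o, s')): the exponential model for one
    source state; `weights` holds that state's λ values, `feats(obs, s_next)`
    returns the names of the binary features active on (obs, s_next)."""
    scores = {s: math.exp(sum(weights.get(a, 0.0) for a in feats(obs, s)))
              for s in states}
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

def gis_step(weights, examples, feats, states, C):
    """One Generalized Iterative Scaling update for one source state's model:
    λ_a += (1/C) log(E_data[f_a] / E_model[f_a]).
    `examples` are (obs, next_state) pairs seen from this source state; C is
    the constant number of active features per example that GIS requires
    (in practice a slack feature pads examples up to C)."""
    empirical = defaultdict(float)
    expected = defaultdict(float)
    for obs, s_true in examples:
        for a in feats(obs, s_true):
            empirical[a] += 1.0
        p = transition_probs(weights, feats, states, obs)
        for s in states:
            for a in feats(obs, s):
                expected[a] += p[s]
    for a in empirical:
        if expected[a] > 0:
            weights[a] = weights.get(a, 0.0) + math.log(empirical[a] / expected[a]) / C
    return weights
```

With one active feature per example (C = 1) and disjoint features per class, a single GIS step already matches the empirical transition frequencies; real feature sets overlap, which is exactly the regime the paper argues HMMs handle poorly.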

Commentary

I liked the careful distinction between the two flaws and appreciated the experiments with the FeatureHMM, which delineate the advantages due to features from those due to conditional modeling.