K. Seymore et al, AAAI-99

Revision as of 23:22, 27 September 2011 by Tkumar


K. Seymore, A. McCallum, and R. Rosenfeld. Learning Hidden Markov Model structure for information extraction. In Papers from the AAAI-99 Workshop on Machine Learning for Information Extraction, pages 37-42, 1999.

Online Version



In this paper, the authors explore the use of Hidden Markov Models (HMMs) for information extraction tasks. The paper focuses on two questions: how to learn the model structure from the data itself, and what role labeled and unlabeled data play in model training. It also reports that a model with multiple states per field outperforms a model with one state per field. The resulting models are applied to extracting fields from research papers.


The paper proposes learning the HMM structure from the data itself. Initially, every word in the training data is treated as its own state, with a transition to the state of the neighboring (following) word. There is one start state, with a transition to the first word, and one end state, with a transition from the last word. The paper proposes two techniques for merging the states of this initial model:

1: Neighbor Merging - Merge two states if they are associated with the same class and have a transition link between them.

2: V Merging - Merge two states if they belong to the same class and share a transition to or from a common state.
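The construction and the two merge rules can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based state representation, and the toy training sequences are all assumptions made for the example, and the V-merge check treats either a shared successor or a shared predecessor as a "common state". Transition and emission counts are omitted to keep the sketch short.

```python
def build_initial_model(sequences):
    """One state per word token; each sequence is a list of (word, label)."""
    states = {}    # state id -> {"label": class label, "words": emitted words}
    edges = set()  # directed transitions as (from_state, to_state)
    next_id = 0
    for seq in sequences:
        prev = "START"
        for word, label in seq:
            states[next_id] = {"label": label, "words": [word]}
            edges.add((prev, next_id))
            prev = next_id
            next_id += 1
        edges.add((prev, "END"))
    return states, edges


def merge(states, edges, keep, drop):
    """Fold state `drop` into state `keep` and redirect its transitions."""
    states[keep]["words"].extend(states[drop]["words"])
    del states[drop]
    return {(keep if a == drop else a, keep if b == drop else b)
            for a, b in edges}


def neighbor_merge(states, edges):
    """Merge same-label states that are directly linked by a transition."""
    changed = True
    while changed:
        changed = False
        for a, b in sorted(edges, key=str):
            if (a in states and b in states and a != b
                    and states[a]["label"] == states[b]["label"]):
                edges = merge(states, edges, a, b)
                changed = True
                break
    return states, edges


def v_merge(states, edges):
    """Merge same-label states that share a predecessor or a successor."""
    changed = True
    while changed:
        changed = False
        ids = sorted(states)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                if states[a]["label"] != states[b]["label"]:
                    continue
                preds_a = {x for x, y in edges if y == a}
                preds_b = {x for x, y in edges if y == b}
                succs_a = {y for x, y in edges if x == a}
                succs_b = {y for x, y in edges if x == b}
                if (preds_a & preds_b) or (succs_a & succs_b):
                    edges = merge(states, edges, a, b)
                    changed = True
                    break
            if changed:
                break
    return states, edges


# Hypothetical toy data: two labeled training instances.
training = [
    [("Learning", "title"), ("HMMs", "title"), ("Seymore", "author")],
    [("Extraction", "title"), ("McCallum", "author")],
]
states, edges = build_initial_model(training)
states, edges = neighbor_merge(states, edges)
states, edges = v_merge(states, edges)
```

On this toy input the five initial word states collapse into one state per class, and the merged title state keeps a self-loop where two adjacent title words were joined. The sketch applies every permissible merge greedily; it does not score candidate models.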

These merges are performed one at a time, and the model structure that maximizes the probability of the model given the data, P(M|D), is chosen. States are merged until an optimal model is reached.

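According to Bayes' rule:

P(M|D) = P(D|M)*P(M) / P(D)

P(D) is the same for every candidate model, so it is enough to maximize P(D|M)*P(M). Here P(D|M) can be calculated from the data using the Viterbi algorithm, and the prior P(M) can be chosen so that it gives more weight to smaller models.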
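To make the trade-off between data fit and model size concrete, here is a small Python sketch of the scoring step. It rests on several assumptions that are not taken from the paper: the prior is modeled as P(M) proportional to exp(-alpha * number of states), P(D|M) is approximated by the probability of the single best Viterbi path, unseen events are floored at a tiny probability, and the two toy models and the names `viterbi_log_prob`, `log_posterior`, and `alpha` are hypothetical.

```python
import math

FLOOR = 1e-12  # assumed floor so unseen events do not produce log(0)


def log_p(p):
    return math.log(max(p, FLOOR))


def viterbi_log_prob(obs, states, start_p, trans_p, emit_p):
    """Log probability of the single best state path for `obs`."""
    best = {s: log_p(start_p.get(s, 0.0)) + log_p(emit_p[s].get(obs[0], 0.0))
            for s in states}
    for word in obs[1:]:
        best = {
            s: max(best[r] + log_p(trans_p[r].get(s, 0.0)) for r in states)
               + log_p(emit_p[s].get(word, 0.0))
            for s in states
        }
    return max(best.values())


def log_posterior(data, model, alpha=1.0):
    """log P(D|M) + log P(M), up to a constant shared by all models."""
    states, start_p, trans_p, emit_p = model
    log_likelihood = sum(
        viterbi_log_prob(obs, states, start_p, trans_p, emit_p) for obs in data
    )
    log_prior = -alpha * len(states)  # assumed prior favoring fewer states
    return log_likelihood + log_prior


# Hypothetical candidates: a one-state model and a two-state model.
small = (
    ["t"],
    {"t": 1.0},
    {"t": {"t": 1.0}},
    {"t": {"a": 0.5, "b": 0.5}},
)
large = (
    ["s1", "s2"],
    {"s1": 1.0},
    {"s1": {"s2": 1.0}, "s2": {}},
    {"s1": {"a": 1.0}, "s2": {"b": 1.0}},
)
data = [["a", "b"]]

weak_prior = (log_posterior(data, small, alpha=1.0),
              log_posterior(data, large, alpha=1.0))
strong_prior = (log_posterior(data, small, alpha=2.0),
                log_posterior(data, large, alpha=2.0))
```

With the weaker prior the two-state model scores higher because it fits the data exactly; with the stronger prior the penalty on the extra state outweighs that fit and the one-state model wins. This is the sense in which the choice of P(M) steers the merging toward smaller models.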