K. Seymore et al., AAAI-99
Citation
K. Seymore, A. McCallum, and R. Rosenfeld. Learning Hidden Markov Model Structure for Information Extraction. In Papers from the AAAI-99 Workshop on Machine Learning for Information Extraction, pages 37-42, 1999.
Online Version
Summary
This paper explores the use of Hidden Markov Models (HMMs) for information extraction tasks. It focuses on two questions: how to learn the model structure from the data itself, and what roles labeled and unlabeled data play in model training. The authors show that a model with multiple states per field outperforms one with a single state per field. The learned models are then applied to extracting fields from research papers.
Method
The paper proposes learning the HMM structure from the data itself. Initially, every word in the training data is treated as its own state, with a transition to the state of the neighboring word. There is a single start state with a transition to the first word's state, and a single end state with a transition from the last word's state. The paper proposes two techniques for merging states:
1: Neighbor Merging - two states are merged if they are associated with the same class and there is a transition between them.
2: V Merging - two states are merged if they belong to the same class and they transition to the same common state.
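As an illustration, the two merge rules can be sketched on a toy state graph. The state labels, transition set, and helper names below are invented for illustration and do not come from the paper.

```python
# Toy sketch of the two merge rules. `states` maps a state id to its
# field class; `transitions` is a set of (src, dst) pairs. All names
# here are hypothetical, not from the paper.

def neighbor_merge_candidates(states, transitions):
    """Pairs of same-class states with a direct transition between them."""
    return sorted((a, b) for (a, b) in transitions
                  if a != b and states[a] == states[b])

def v_merge_candidates(states, transitions):
    """Pairs of same-class states that transition into a common state."""
    preds = {}
    for (a, b) in transitions:
        preds.setdefault(b, set()).add(a)
    pairs = set()
    for parents in preds.values():
        ordered = sorted(parents)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                a, b = ordered[i], ordered[j]
                if states[a] == states[b]:
                    pairs.add((a, b))
    return sorted(pairs)

# Word-states 1-4 plus an end state; each state carries its field class.
states = {1: "title", 2: "title", 3: "author", 4: "author", 5: "end"}
transitions = {(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)}

print(neighbor_merge_candidates(states, transitions))  # [(1, 2)]
print(v_merge_candidates(states, transitions))         # [(3, 4)]
```

Intuitively, neighbor merging collapses runs of consecutive same-field words into one state, while V merging collapses parallel branches of the same field that feed into a common successor.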
These merges are performed greedily: at each step the merge is chosen that maximizes the probability of the model given the data, P(M|D), and states are merged one by one until an optimal model is reached.
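The greedy search can be sketched as follows. `propose_merges`, `apply_merge`, and `score` are hypothetical placeholders for candidate generation and the Bayesian model score; the toy model and score at the bottom are invented for illustration.

```python
# Hedged sketch of the greedy search: starting from the maximally
# specific model, repeatedly apply the single merge that most improves
# the model score, stopping when no merge helps.

def greedy_merge(model, propose_merges, apply_merge, score):
    best = score(model)
    while True:
        candidates = [(score(apply_merge(model, m)), m)
                      for m in propose_merges(model)]
        if not candidates:
            return model
        cand_score, merge = max(candidates, key=lambda t: t[0])
        if cand_score <= best:
            return model  # no merge improves the score
        model, best = apply_merge(model, merge), cand_score

# Toy model: a sequence of per-word states labeled by field class.
model = ["title", "title", "author", "author", "author"]
propose = lambda m: [(i, i + 1) for i in range(len(m) - 1) if m[i] == m[i + 1]]
apply = lambda m, p: m[:p[0]] + m[p[0] + 1:]  # collapse the pair into one state
score = lambda m: -len(m)  # toy score standing in for log P(M|D)

result = greedy_merge(model, propose, apply, score)
print(result)  # ['title', 'author']
```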
According to Bayes' rule, P(M|D) ∝ P(D|M) * P(M),
where P(D|M) can be approximated from the data using the Viterbi algorithm, and P(M) is a prior chosen to give more weight to smaller models.
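To make the scoring concrete, here is a minimal sketch that approximates P(D|M) with the Viterbi (best-path) log-likelihood and uses a prior P(M) proportional to c raised to the number of states, which favors smaller models. All probabilities, the constant c, and the smoothing value are invented for illustration.

```python
import math

def viterbi_log_prob(obs, states, start_p, trans_p, emit_p):
    """Log-probability of the single best state path (Viterbi)."""
    # Unseen emissions/transitions get a tiny floor probability (assumption).
    v = {s: math.log(start_p[s]) + math.log(emit_p[s].get(obs[0], 1e-12))
         for s in states}
    for o in obs[1:]:
        v = {s: max(v[p] + math.log(trans_p[p].get(s, 1e-12)) for p in states)
               + math.log(emit_p[s].get(o, 1e-12))
             for s in states}
    return max(v.values())

def log_model_prior(num_states, c=0.9):
    """Prior favoring smaller models: P(M) proportional to c^(#states)."""
    return num_states * math.log(c)

def log_posterior(obs, states, start_p, trans_p, emit_p):
    # log P(M|D) up to a constant: log P(D|M) + log P(M)
    return (viterbi_log_prob(obs, states, start_p, trans_p, emit_p)
            + log_model_prior(len(states)))

# Tiny two-state example; all probabilities are hypothetical.
states = ["title", "author"]
start_p = {"title": 0.8, "author": 0.2}
trans_p = {"title": {"title": 0.6, "author": 0.4},
           "author": {"title": 0.1, "author": 0.9}}
emit_p = {"title": {"learning": 0.5, "hidden": 0.5},
          "author": {"seymore": 0.7, "mccallum": 0.3}}

obs = ["learning", "hidden", "seymore"]
score = log_posterior(obs, states, start_p, trans_p, emit_p)
print(round(score, 3))  # -3.604
```

In the paper's setting this score would be compared across candidate models produced by the merges, with the prior term penalizing models that keep more states than the data justifies.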