KeisukeKamataki writeup of Cohen 2005

Summary: The authors propose a meta-learning algorithm for sequential data called stacked sequential learning. Like boosting, it is a general framework, so it can be combined with any learner, including sequential learners such as CRF or MEMM. The key idea is to build an extended dataset: cross-validation is used to obtain predicted labels for every training sequence, and those predictions are injected into each sequence so that the final model can use all available information (the observed data plus the predicted labels of preceding and following positions). They used MaxEnt to generate the predicted labels. The algorithm increases training time by a factor of approximately K+2 over the base learner, where K is the number of cross-validation folds and the additional 2 comes from building the final classifier. When this extra cost is a problem, the original data can instead be split into just two halves, with a model trained on each. The method greatly improved the performance of the base learners it was combined with, both non-sequential and sequential.
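The procedure in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a tiny hand-written logistic regression stands in for MaxEnt, the window of predicted labels is symmetric, labels are binary, and all names (`train_logreg`, `extend`, `stacked_fit`, `stacked_predict`, `make_seq`) are hypothetical.

```python
import math
import random

def train_logreg(X, y, epochs=30, lr=0.1):
    """Tiny binary logistic regression via SGD (assumed stand-in for MaxEnt)."""
    w = [0.0] * (len(X[0]) + 1)  # last entry is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() in range
            g = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * g * xj
            w[-1] -= lr * g
    return w

def predict(w, X):
    return [1 if w[-1] + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0
            for xi in X]

def flatten(seqs):
    """Turn a list of (features, labels) sequences into flat X, y lists."""
    return ([x for xs, _ in seqs for x in xs],
            [t for _, ys in seqs for t in ys])

def extend(xs, yhat, window):
    """Append predicted labels at offsets -window..+window (0.0 past the ends)."""
    n = len(xs)
    return [list(x) + [float(yhat[i + d]) if 0 <= i + d < n else 0.0
                       for d in range(-window, window + 1)]
            for i, x in enumerate(xs)]

def stacked_fit(seqs, k=5, window=1):
    # K fits: out-of-fold predictions, holding out whole sequences per fold.
    yhat = [None] * len(seqs)
    for fold in range(k):
        w = train_logreg(*flatten([s for i, s in enumerate(seqs) if i % k != fold]))
        for i, (xs, _) in enumerate(seqs):
            if i % k == fold:
                yhat[i] = predict(w, xs)
    # One more fit: base model on all data, used to label unseen sequences.
    f = train_logreg(*flatten(seqs))
    # Final fit: model trained on features extended with predicted labels.
    X_ext = [row for (xs, _), yh in zip(seqs, yhat)
             for row in extend(xs, yh, window)]
    f2 = train_logreg(X_ext, flatten(seqs)[1])
    return f, f2

def stacked_predict(f, f2, xs, window=1):
    return predict(f2, extend(xs, predict(f, xs), window))

def make_seq(rng, n=30):
    """Synthetic sequence: labels come in runs, one noisy feature per position."""
    ys, y = [], rng.randint(0, 1)
    for _ in range(n):
        if rng.random() < 0.1:
            y = 1 - y
        ys.append(y)
    return [[(1.0 if t else -1.0) + rng.gauss(0, 1.5)] for t in ys], ys

rng = random.Random(0)
train_seqs = [make_seq(rng) for _ in range(20)]
test_xs, test_ys = make_seq(rng)
f, f2 = stacked_fit(train_seqs, k=5, window=1)
pred = stacked_predict(f, f2, test_xs, window=1)
```

Note that `stacked_fit` calls the base trainer K+2 times (K cross-validation fits, one fit on all the data, one fit on the extended data), which matches the cost estimate above. Setting `k=2` corresponds to the cheaper two-halves variant.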

I like: The clear, solid error analysis of MEMM (which tends to suffer from local correlation between adjacent labels), followed by their own solution.
I like: Highly adaptable to any sequential learner.
