Rbosaghz writeup of Cohen and Carvalho
Cohen_2005_stacked_sequential_learning by user:Rbosaghz
Stacked sequential learning
This paper takes an arbitrary base learner and, for sequence-partitioning tasks (say, labeling video frames or document lines), augments each example's features with the labels of nearby examples, yielding a sequential learner competitive with CRFs and HMMs.
Their approach is motivated by the observation that MEMMs perform extremely badly on the signature-detection problem, with an error rate several times that of CRFs. The authors attribute this poor performance to a mismatch between training and test conditions: at training time the model conditions on true previous labels, but at test time it only sees its own predictions. To correct this mismatch, they build an extended dataset in which the true previous class in a sequence is replaced by a predicted previous class, obtained by cross-validation on the training data. To show that their stacked models hold merit, the authors demonstrate that a stacked sequential maxent learner outperforms CRFs.
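The stacking recipe above can be sketched in a few lines. This is a minimal illustration, not the authors' exact setup: it assumes scikit-learn, uses logistic regression as the maxent base learner, invents a synthetic toy task, and augments each example with one predicted neighbor label on each side (the paper explores wider windows).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic toy task (hypothetical): the label at each position depends
# on the current feature and the previous position's feature, so
# neighbor-label context genuinely helps a non-sequential learner.
rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 2))
y = (X[:, 0] + 0.5 * np.roll(X[:, 0], 1) > 0).astype(int)

def augment(X, yhat, window=1):
    """Append the predicted labels of the `window` previous and next
    positions as extra features (sequence boundaries padded with 0)."""
    cols = [X]
    for k in range(1, window + 1):
        prev = np.concatenate([np.zeros(k), yhat[:-k]])
        nxt = np.concatenate([yhat[k:], np.zeros(k)])
        cols += [prev[:, None], nxt[:, None]]
    return np.hstack(cols)

# 1) Train the base (non-sequential) maxent learner on raw features.
base = LogisticRegression().fit(X, y)

# 2) Get cross-validated predictions on the training data, so the
#    stacked learner is trained on *predicted* neighbor labels --
#    the same kind of (noisy) labels it will see at test time.
yhat_cv = cross_val_predict(LogisticRegression(), X, y, cv=5)

# 3) Train the stacked learner on the extended dataset.
stacked = LogisticRegression().fit(augment(X, yhat_cv), y)

def predict(X_new):
    """Two-stage inference: base predictions, then stacked correction."""
    yhat = base.predict(X_new)
    return stacked.predict(augment(X_new, yhat))
```

Replacing `yhat_cv` with the true labels `y` in step 3 would reproduce exactly the train/test mismatch the paper diagnoses: the stacked model would learn to trust perfect neighbor labels that are never available at test time.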