# Stacked Sequential Learning

Stacked sequential learning is a meta-learning method, proposed by Cohen and Carvalho (2005), that addresses the mismatch between training and testing conditions in sequential models. It stacks two stages of prediction, with the second stage making use of the outputs of the first.

## Motivation

Consider the general form of sequential prediction: given an observation sequence $\mathbf{x} = (x_1, \ldots, x_n)$, predict the label sequence $\mathbf{y} = (y_1, \ldots, y_n)$. The prediction of a label $y_i$ depends on its neighboring labels, typically $y_{i-1}$ and $y_{i+1}$. During training, the true neighboring labels are available; during testing, however, $y_i$ must be predicted from *predicted* neighboring labels. Because the model's assumptions never match reality exactly, the distribution of the predicted labels differs from that of the true labels, and this mismatch can degrade performance.

The solution is a two-stage approach. In the first stage, we train a base classifier on the original data and use cross-validation to obtain predicted labels for every training example; these predictions mimic the imperfect labels the model will see at test time. In the second stage, we train a second classifier whose features are extended with the predicted labels of the neighboring positions, so that it learns to correct the mistakes of the first classifier. At test time, the base classifier runs first, and its predictions are fed to the second classifier.
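The two stages above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy segment-labeling data, the logistic-regression base learner, and the one-position neighbor window are all assumptions made for the example; `cross_val_predict` supplies the cross-validated training-set predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)

# Hypothetical toy sequence: labels flip between 0 and 1 in runs,
# and each observation is the label plus Gaussian noise.
n = 400
y = np.zeros(n, dtype=int)
state = 0
for i in range(n):
    if rng.random() < 0.1:  # occasionally start a new segment
        state = 1 - state
    y[i] = state
X = (y + rng.normal(scale=1.0, size=n)).reshape(-1, 1)

# Stage 1: base classifier. Cross-validation yields predicted labels
# for the training data that mimic test-time (imperfect) predictions.
base = LogisticRegression()
y_hat = cross_val_predict(base, X, y, cv=5)
base.fit(X, y)

def extend(X, labels):
    """Augment each example with the predicted labels of its neighbors."""
    prev = np.r_[labels[0], labels[:-1]]   # left neighbor's predicted label
    nxt = np.r_[labels[1:], labels[-1]]    # right neighbor's predicted label
    return np.c_[X, prev, nxt]

# Stage 2: train a second classifier on the extended features, using
# the cross-validated predictions (not the true labels) as neighbors.
stacked = LogisticRegression()
stacked.fit(extend(X, y_hat), y)

# Inference: run the base classifier, then feed its predictions
# into the stacked classifier.
y_base = base.predict(X)
y_final = stacked.predict(extend(X, y_base))
print(f"base acc: {(y_base == y).mean():.3f}, "
      f"stacked acc: {(y_final == y).mean():.3f}")
```

Obtaining `y_hat` by cross-validation is the key design point: if the second stage were trained on the true neighboring labels, it would again face a train/test mismatch, which is exactly what the method is meant to remove.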