Stacked Sequential Learning

This is a meta-learning [[Category::method]] that deals with the mismatch between training and testing data for sequential models, proposed in [[RelatedPaper::Cohen and Carvalho, 2005]]. It stacks two stages of prediction, where the second stage makes use of the results of the first stage.
  
 
== Motivation ==
 
Consider the general form of sequential prediction, in which we need to predict a label sequence <math>\mathbf{y} = (y_1, \ldots, y_n)</math> given an observation sequence <math>\mathbf{x} = (x_1, \ldots, x_n)</math>. The prediction of a label <math>y_i</math> depends on its neighboring labels, typically <math>y_{i-1}</math> and <math>y_{i+1}</math>. During training we have the true neighboring labels, but during testing <math>y_i</math> must be predicted from the '''predicted''' neighboring labels. Because the model's assumptions never hold exactly in practice, the distribution of the predicted labels differs from that of the true labels, and this '''mismatch''' can degrade performance.
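
For concreteness, consider a left-to-right conditional model that factors the prediction as (an illustrative model, not one prescribed by the paper)

:<math>P(\mathbf{y} \mid \mathbf{x}) = \prod_{i=1}^{n} P(y_i \mid y_{i-1}, x_i).</math>

Each factor is fit during training with the ''true'' <math>y_{i-1}</math>, but greedy left-to-right decoding must evaluate it at the ''predicted'' <math>\hat{y}_{i-1}</math>, so an early error changes the input of every later factor and can propagate down the sequence.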
The solution is a two-stage approach: in the first stage, we train a base classifier as usual; in the second stage, we train a second classifier whose input features are extended with the ''predicted'' labels of nearby positions, rather than the ''true'' labels, so that it can learn to correct the systematic mistakes of the first classifier. To make these predicted labels realistic, the predictions for the training data are obtained with cross-validation.
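
Below is a minimal sketch of this recipe, assuming scikit-learn, numeric labels, a logistic-regression base learner, and training data that forms a single sequence (all illustrative choices; the helper <code>extend</code> and the window half-width <code>w</code> are hypothetical names, not from the paper):

<pre>
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def extend(X, y_hat, w=1):
    # Append the predicted labels of the w left and w right neighbors
    # to each example's features, zero-padded at the sequence ends.
    n = len(y_hat)
    cols = []
    for offset in range(-w, w + 1):
        if offset == 0:
            continue
        col = np.zeros(n)
        if offset < 0:               # col[i] = y_hat[i + offset] (left neighbor)
            col[-offset:] = y_hat[:offset]
        else:                        # col[i] = y_hat[i + offset] (right neighbor)
            col[:-offset] = y_hat[offset:]
        cols.append(col)
    return np.hstack([X, np.column_stack(cols)])

def train_stacked(X, y, w=1, k=5):
    # Stage 1: an ordinary base classifier.
    f = LogisticRegression().fit(X, y)
    # Cross-validated predictions mimic the noisy labels seen at test time:
    # each training example is labeled by a model that never saw it.
    y_hat = cross_val_predict(LogisticRegression(), X, y, cv=k)
    # Stage 2: learn to correct stage 1, given its predictions as extra features.
    f2 = LogisticRegression().fit(extend(X, y_hat, w), y)
    return f, f2

def predict_stacked(f, f2, X, w=1):
    return f2.predict(extend(X, f.predict(X), w))
</pre>

At test time the base classifier runs first, and its predictions are fed through the same windowing into the second-stage classifier. The window half-width and the number of folds are tuning parameters; with many short sequences, the windowing should be applied within each sequence separately.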
  
 
== Algorithm ==
 
== Variations ==

== Time Complexity ==

== Applications ==