Philgoo Han writeup of Cohen and Carvalho

This is a review of Cohen_2005_stacked_sequential_learning by user:Ironfoot.

  • A meta-learning algorithm applicable to any base learner
  • High error rate for MEMM (on this particular training data)
    • Different from label bias or observation bias; it is attributed to too strong a weight on the history feature.
    • Since the training and test data come from the same source, the two should have the same characteristics. One reason I can think of for the large error rate is that a single false prediction makes all the following text lines false in an MEMM (see the toy sketch below). But can this single cause magnify the error rate more than tenfold?
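The cascading-error conjecture above can be made concrete with a toy sketch of my own (not from the paper): a greedy CMM-style tagger whose hand-picked history weight outweighs ambiguous observations, so a single noisy token corrupts every later prediction.

```python
# Toy illustration (hypothetical weights, not the paper's model):
# a greedy MEMM/CMM-style tagger where the history feature outweighs
# ambiguous observations, so one error cascades down the sequence.
OBS_SCORE = {"H": 3.0, "B": -3.0, "?": 0.0}  # observation cue per token

def classify(x, prev):
    history = 2.0 if prev == "head" else -2.0  # strong weight on history
    return "head" if history + OBS_SCORE[x] > 0 else "body"

def greedy_tag(xs, start="body"):
    labels, prev = [], start
    for x in xs:
        prev = classify(x, prev)
        labels.append(prev)
    return labels

# True labels are all "body"; only the first token is a noisy "H".
print(greedy_tag(["H", "?", "?", "?"]))  # -> ['head', 'head', 'head', 'head']
```

Still, a tenfold error blow-up would require such cascades to be both long and frequent, so the question stands.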
  • Sequential stacking
    • Make a cross-validated prediction ŷ of y with the base learner, then train a second learner on examples extended with a window of nearby predictions (sketched below).
    • Why is "ŷ similar to the prediction produced by an f learned by A on a size-m sample that does not include x"?
    • I don't get the meaning of "f will not be used as the inner loop of a Viterbi or beam-search."
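For reference, below is a minimal sketch of the stacked sequential learning meta-algorithm as I read it, with scikit-learn standing in for the base learner A; the function names, the single window radius w (the paper allows separate history and future sizes W_h and W_f), and the -1 sentinel padding are my own choices. The cross_val_predict step is what makes ŷ "similar to the prediction produced by an f learned on a sample that does not include x": each ŷ_i comes from a fold whose training split excluded x_i. It also suggests a reading of the Viterbi remark: the stage-2 classifier f' treats neighboring predictions as ordinary features, so inference is two classification passes rather than a Viterbi or beam search over f.

```python
# Minimal sketch of stacked sequential learning (my reading of the paper).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def extend(X, y_hat, w=1):
    """Append the w previous and w future predicted labels to each row;
    out-of-range positions are padded with a sentinel label (-1)."""
    n = len(y_hat)
    padded = np.concatenate([np.full(w, -1), y_hat, np.full(w, -1)])
    windows = np.array([padded[i:i + 2 * w + 1] for i in range(n)])
    return np.hstack([X, windows])

def train_stacked(X, y, w=1, K=5):
    # y_hat[i] is predicted by a model fit on folds that exclude x_i.
    y_hat = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=K)
    f = LogisticRegression(max_iter=1000).fit(X, y)                     # stage 1
    f2 = LogisticRegression(max_iter=1000).fit(extend(X, y_hat, w), y)  # stage 2
    return f, f2

def predict_stacked(f, f2, X, w=1):
    y_hat = f.predict(X)                    # first pass: base predictions
    return f2.predict(extend(X, y_hat, w))  # second pass: stacked model
```

(For simplicity this treats X as one long sequence; in practice the windows should not cross document boundaries.)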
  • Result
    • Much lower error rate; using a moderately large window (history) size improves precision.
    • This may be an implementation issue, but how do you handle the boundary conditions where there is not enough history or future context? (One option is sketched below.)
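On the boundary question above: one plausible scheme (my assumption, not confirmed by the paper) is to pad the prediction sequence with a sentinel label, as the extend() helper in the earlier sketch does, so the learner can treat "no history available" as a signal of its own.

```python
# Hypothetical boundary handling: pad predictions with a sentinel (-1).
import numpy as np

y_hat = np.array([0, 1, 1, 0])  # predicted labels for a 4-line sequence
w = 2                           # window radius
padded = np.concatenate([np.full(w, -1), y_hat, np.full(w, -1)])
print(padded[0:2 * w + 1])      # window at position 0: [-1 -1  0  1  1]
```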
  • Also works well on different datasets.
    • It seems that the sequential stacking algorithm can mitigate CMM bias problems in general.