Difference between revisions of "Daume ICML 2009"

From Cohen Courses

Revision as of 22:05, 29 September 2011

A paper on using unsupervised SEARN

In progress by Francis Keith

Citation

"Unsupervised Search-based Structured Prediction", Hal Daume III, ICML 2009

Online Version

An online version of the paper can be found here [http://www.umiacs.umd.edu/~hal/docs/daume09unsearn.pdf]

Summary

This paper details a methodology for unsupervised SEARN. It compares the approach against other methods, first on synthetic data and then on unsupervised dependency parsing.

Algorithm

The paper describes the basic SEARN algorithm.

In the supervised form, the algorithm uses a sample space of <math>(x,y)</math> pairs, where <math>x</math> is the input and <math>y</math> is the true output. In the unsupervised case, the algorithm must be run on an input of <math>x</math> alone, with the classifier still producing <math>y</math>.

The proposed solution is essentially to predict <math>y</math> first, and then perform the normal prediction. The loss function depends only on <math>x</math>.
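
The two-step idea can be illustrated with a small Python sketch. It is not the paper's implementation: it assumes that the "normal prediction" step reconstructs the input from the predicted <math>y</math>, and it uses a Hamming loss as a stand-in for whatever loss the paper defines. The names `predict_then_reconstruct`, `hamming_loss`, `toy_latent`, and `toy_input` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Sequence, Tuple


def hamming_loss(x_true: Sequence[str], x_pred: Sequence[str]) -> int:
    """Count positions where the reconstructed input differs from the observed input."""
    if len(x_true) != len(x_pred):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(x_true, x_pred) if a != b)


def predict_then_reconstruct(
    x: Sequence[str],
    predict_latent: Callable[[Sequence[str]], Sequence[str]],
    predict_input: Callable[[Sequence[str]], Sequence[str]],
) -> Tuple[List[str], List[str], int]:
    """Predict y from x, then predict x back from y; the loss looks only at x."""
    y_hat = list(predict_latent(x))   # step 1: predict the unobserved output y
    x_hat = list(predict_input(y_hat))  # step 2 (assumed): reconstruct x from y
    loss = hamming_loss(x, x_hat)     # no true y is needed to compute the loss
    return y_hat, x_hat, loss


# Toy stand-ins for learned classifiers (purely illustrative).
def toy_latent(x: Sequence[str]) -> List[str]:
    return ["D" if tok == "the" else "N" for tok in x]


def toy_input(y: Sequence[str]) -> List[str]:
    return ["the" if lab == "D" else "dog" for lab in y]


x = ["the", "dog", "the", "cat"]
y_hat, x_hat, loss = predict_then_reconstruct(x, toy_latent, toy_input)
```

In this toy run the predicted <math>y</math> is never compared against a gold label; only the mismatch between the reconstructed and the observed <math>x</math> contributes to the loss, which is what lets the procedure run without supervision.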