Daume ICML 2009
Revision as of 21:05, 29 September 2011
A paper on using unsupervised SEARN
In progress by Francis Keith
Citation
"Unsupervised Search-based Structured Prediction", Hal Daume III, ICML 2009
Online Version
An online version of the paper can be found at http://www.umiacs.umd.edu/~hal/docs/daume09unsearn.pdf
Summary
This paper describes a method for running SEARN in an unsupervised setting. It compares the results to other methods, first on synthetic data and then against other approaches to unsupervised dependency parsing.
Algorithm
The paper first describes the basic SEARN algorithm.
In the supervised form, the algorithm uses a sample space of (x, y) pairs, where x is the input and y is the true output. In the unsupervised case, the algorithm is given only x, yet the classifier must still produce y.
The proposed solution is essentially to predict y first and then perform the normal prediction. The loss function depends only on x.
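The idea of a loss that depends only on x can be illustrated with a minimal sketch. It assumes that the "normal prediction" step reconstructs x from the predicted y, which is one reading of the description above and not a statement of the paper's exact procedure. The names `hamming_loss`, `unsupervised_loss`, `toy_latent`, and `toy_input` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Sequence

Tokens = Sequence[str]


def hamming_loss(predicted: Tokens, observed: Tokens) -> int:
    """Count mismatched positions, plus any difference in length."""
    mismatches = sum(p != o for p, o in zip(predicted, observed))
    return mismatches + abs(len(predicted) - len(observed))


def unsupervised_loss(
    x: Tokens,
    predict_latent: Callable[[Tokens], List[str]],
    predict_input: Callable[[Tokens], List[str]],
) -> int:
    """Predict y from x, then predict x back from y; score against x only."""
    y_hat = predict_latent(x)      # step 1: predict the unobserved output y
    x_hat = predict_input(y_hat)   # step 2 (assumed): predict x given y_hat
    return hamming_loss(x_hat, x)  # the true y never appears in the loss


# Toy stand-ins for learned classifiers (purely illustrative).
def toy_latent(x: Tokens) -> List[str]:
    return ["V" if tok.endswith("s") else "N" for tok in x]


def toy_input(y: Tokens) -> List[str]:
    return ["runs" if tag == "V" else "dog" for tag in y]


print(unsupervised_loss(["dog", "runs"], toy_latent, toy_input))
print(unsupervised_loss(["cat", "runs"], toy_latent, toy_input))
```

In this sketch no true y is ever supplied: the only supervision signal is how well the input x is recovered, so the predicted y acts as an intermediate step.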