Difference between revisions of "Structured SVMs"
Line 2: | Line 2: | ||
== The Method and When to Use it == | == The Method and When to Use it == | ||
+ | Tsochantaridis
et
al.
(2005)
–
extends
their
2004
paper
| ||
+ | |||
+ | SVMstruct is a Support Vector Machine (SVM) algorithm for predicting multivariate or structured outputs. It performs supervised learning by approximating a mapping | ||
+ | |||
+ | h: X --> Y | ||
+ | using labeled training examples (x1,y1), ..., (xn,yn). Unlike regular SVMs, however, which consider only univariate predictions like in classification and regression, SVMstruct can predict complex objects y like trees, sequences, or sets. Examples of problems with complex outputs are natural language parsing, sequence alignment in protein homology detection, and markov models for part-of-speech tagging. The SVMstruct algorithm can also be used for linear-time training of binary and multi-class SVMs under the linear kernel [4]. | ||
+ | |||
+ | The structured support vector machine is a machine learning algorithm that generalizes the Support Vector Machine (SVM) classifier. Whereas the SVM classifier supports binary classification, multiclass classification and regression, the structured SVM allows training of a classifier for general structured output labels. | ||
+ | As an example, a sample instance might be a natural language sentence, and the output label is an annotated parse tree. Training a classifier consists of showing pairs of correct sample and output label pairs. After training, the structured SVM model allows one to predict for new sample instances the corresponding output label; that is, given a natural language sentence, the classifier can produce the most likely parse tree. | ||
+ | |||
+ | |||
== The Algorithm == | == The Algorithm == | ||
− | = | + | Slightly
different
version
of
the
loss
function:
|
+ | |||
+ | <math> | ||
+ | \min_i \frac{C}{2} ||w||{^{2}}{_{2}} + \sum^N_{i=1} \xi_i | ||
+ | </math> | ||
+ | <math> | ||
+ | s.t. \forall i, \forall y, w^T g (x_i,y) \ge +1 - \frac {\xi_i}{cost(y, y_i)} | ||
+ | </math> | ||
== Related Papers == | == Related Papers == | ||
+ | *[http://svmlight.joachims.org/svm_struct.html| I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.] | ||
+ | * Optimization Algorithms | ||
+ | **Taskar
et
al.
(2003):
SMO
based
on
factored
dual
| ||
+ | **Bartlet
et
al.
(2004)
and
Collins
et
al.
(2008):
exponentiated
gradient
| ||
+ | **Tsochantaridis
et
al.
(2005):
cutting
planes
(based
on
dual)
| ||
+ | **Taskar
et
al.
(2005):
dual
extragradient
| ||
+ | **Ratliff
et
al.
(2006):
(stochastic)
subgradient
descent
| ||
+ | **Crammer
et
al.
(2006):
online
“passive‐aggressive”
algorithms |
Revision as of 15:08, 2 November 2011
Being edited by Rui Correia
The Method and When to Use it
Tsochantaridis et al. (2005) – extends their 2004 paper
SVMstruct is a Support Vector Machine (SVM) algorithm for predicting multivariate or structured outputs. It performs supervised learning by approximating a mapping
h: X --> Y using labeled training examples (x1,y1), ..., (xn,yn). Unlike regular SVMs, however, which consider only univariate predictions like in classification and regression, SVMstruct can predict complex objects y like trees, sequences, or sets. Examples of problems with complex outputs are natural language parsing, sequence alignment in protein homology detection, and markov models for part-of-speech tagging. The SVMstruct algorithm can also be used for linear-time training of binary and multi-class SVMs under the linear kernel [4].
The structured support vector machine is a machine learning algorithm that generalizes the Support Vector Machine (SVM) classifier. Whereas the SVM classifier supports binary classification, multiclass classification and regression, the structured SVM allows training of a classifier for general structured output labels. As an example, a sample instance might be a natural language sentence, and the output label is an annotated parse tree. Training a classifier consists of showing pairs of correct sample and output label pairs. After training, the structured SVM model allows one to predict for new sample instances the corresponding output label; that is, given a natural language sentence, the classifier can produce the most likely parse tree.
The Algorithm
Slightly different version of the loss function:
Related Papers
- I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.
- Optimization Algorithms
- Taskar et al. (2003): SMO based on factored dual
- Bartlet et al. (2004) and Collins et al. (2008): exponentiated gradient
- Tsochantaridis et al. (2005): cutting planes (based on dual)
- Taskar et al. (2005): dual extragradient
- Ratliff et al. (2006): (stochastic) subgradient descent
- Crammer et al. (2006): online “passive‐aggressive” algorithms