Difference between revisions of "Structured SVMs"
Line 2: | Line 2: | ||
== The Method and When to Use it == | == The Method and When to Use it == | ||
− | |||
− | + | Structured (or Structural) Support Vector Machines (SSVM), as the name states, is a machine learning model that generalizes the [[Support Vector Machines| Support Vector Machine (SVM)]] classifier, allowing training a classifier for structured output. | |
− | h | + | In general, SSVMs perform supervised learning by approximating a mapping <math>h</math> |
− | |||
− | + | <math> | |
− | + | h: X \rightarrow Y | |
+ | </math> | ||
+ | where <math>x = \{(x_1,y_1), ..., (x_n,y_n)\}</math> is a set of labeled training examples and <math>Y</math> is a complex structured object, like trees, sequences, or sets, instead of simple univariate predictions (as in the SVM case). | ||
+ | Thus, training a SSVM classifier consists of showing pairs of correct sample and output label pairs, that are used for training, allowing to predict for new sample instances the corresponding output label | ||
+ | |||
+ | In NLP one can fing a great variety of problems that rely on complex outputs, such as parsing and Markov Models for part-of-speech tagging. | ||
== The Algorithm == | == The Algorithm == |
Revision as of 15:36, 2 November 2011
Being edited by Rui Correia
The Method and When to Use it
Structured (or Structural) Support Vector Machines (SSVM), as the name states, is a machine learning model that generalizes the Support Vector Machine (SVM) classifier, allowing training a classifier for structured output.
In general, SSVMs perform supervised learning by approximating a mapping
where is a set of labeled training examples and is a complex structured object, like trees, sequences, or sets, instead of simple univariate predictions (as in the SVM case).
Thus, training a SSVM classifier consists of showing pairs of correct sample and output label pairs, that are used for training, allowing to predict for new sample instances the corresponding output label
In NLP one can fing a great variety of problems that rely on complex outputs, such as parsing and Markov Models for part-of-speech tagging.
The Algorithm
Slightly different version of the loss function:
Related Papers
- I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.
- Optimization Algorithms
- Taskar et al. (2003): SMO based on factored dual
- Bartlet et al. (2004) and Collins et al. (2008): exponentiated gradient
- Tsochantaridis et al. (2005): cutting planes (based on dual)
- Taskar et al. (2005): dual extragradient
- Ratliff et al. (2006): (stochastic) subgradient descent
- Crammer et al. (2006): online “passive‐aggressive” algorithms