Structured SVMs
Being edited by Rui Correia
The Method and When to Use it
Tsochantaridis et al. (2005) – extends their 2004 paper
SVMstruct is a Support Vector Machine (SVM) algorithm for predicting multivariate or structured outputs. It performs supervised learning by approximating a mapping
h: X --> Y using labeled training examples (x1,y1), ..., (xn,yn). Unlike regular SVMs, however, which consider only univariate predictions like in classification and regression, SVMstruct can predict complex objects y like trees, sequences, or sets. Examples of problems with complex outputs are natural language parsing, sequence alignment in protein homology detection, and markov models for part-of-speech tagging. The SVMstruct algorithm can also be used for linear-time training of binary and multi-class SVMs under the linear kernel [4].
The structured support vector machine is a machine learning algorithm that generalizes the Support Vector Machine (SVM) classifier. Whereas the SVM classifier supports binary classification, multiclass classification and regression, the structured SVM allows training of a classifier for general structured output labels. As an example, a sample instance might be a natural language sentence, and the output label is an annotated parse tree. Training a classifier consists of showing pairs of correct sample and output label pairs. After training, the structured SVM model allows one to predict for new sample instances the corresponding output label; that is, given a natural language sentence, the classifier can produce the most likely parse tree.
The Algorithm
Slightly different version of the loss function:
Related Papers
- I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.
- Optimization Algorithms
- Taskar et al. (2003): SMO based on factored dual
- Bartlet et al. (2004) and Collins et al. (2008): exponentiated gradient
- Tsochantaridis et al. (2005): cutting planes (based on dual)
- Taskar et al. (2005): dual extragradient
- Ratliff et al. (2006): (stochastic) subgradient descent
- Crammer et al. (2006): online “passive‐aggressive” algorithms