Structured SVMs
 

== The Method and When to Use it ==

Structured (or Structural) Support Vector Machines (SSVMs) are, as the name states, a machine learning model that generalizes the Support Vector Machine (SVM) classifier, allowing one to train a classifier for structured output.

In general, SSVMs perform supervised learning by approximating a mapping <math>h: X \rightarrow Y</math>,

where <math>(x_n, y_n) \in X \times Y</math> is a set of labeled training examples and <math>y \in Y</math> is a complex structured object (such as a tree, a sequence, or a set), instead of a simple univariate prediction (as in the SVM case).

Thus, training an SSVM classifier consists of showing it pairs of correct samples and output labels, which are used to learn a model that can predict the corresponding output label for new sample instances.

In NLP one can find a great variety of problems that rely on complex outputs, such as parsing and Markov models for part-of-speech tagging.
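For illustration, the sketch below shows what predicting a structured output amounts to on a toy tagging problem: every candidate label sequence is scored and the highest-scoring one is returned. The linear score <math>w^\top \Psi(x, y)</math> is introduced formally in the Training section below; the <code>psi</code> helper and all concrete values are illustrative assumptions.

<pre>
import numpy as np
from itertools import product

# Toy sketch of structured prediction: score every candidate output y for an
# input x and return the highest-scoring one. The joint feature map `psi` and
# the weight values are illustrative assumptions.

def psi(x, y, n_tags):
    # hypothetical joint feature map for tagging: feature (position i, tag)
    # fires with value x[i] when position i receives that tag
    feats = np.zeros(len(x) * n_tags)
    for i, tag in enumerate(y):
        feats[i * n_tags + tag] = x[i]
    return feats

def predict(w, x, n_tags):
    # enumerate all tag sequences (feasible only for tiny toy problems; real
    # tasks use Viterbi-style dynamic programming or a parser instead)
    candidates = product(range(n_tags), repeat=len(x))
    return max(candidates, key=lambda y: w @ psi(x, y, n_tags))

w = np.array([0.5, -0.5, -1.0, 1.0])       # toy weight vector
print(predict(w, x=[1.0, 2.0], n_tags=2))  # -> (0, 1)
</pre>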

== Training ==

During training, for a set of samples and labels <math>(x_n, y_n) \in X \times Y, n=1,...,l</math>, the SSVM minimizes the regularized risk function:

<math>
\min_{w} \; \|w\|^2 + C \sum_{n=1}^{l} \max_{y \in Y} \left( \Delta(y_n, y) + w^\top \Psi(x_n, y) - w^\top \Psi(x_n, y_n) \right)
</math>

where <math>\Delta</math> is an arbitrary function which measures the distance between two labels, and <math>\Psi</math> is a function on samples and labels which extracts feature vectors.
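To make the roles of <math>\Delta</math> and <math>\Psi</math> concrete, the following is a minimal sketch that evaluates the inner maximum of the risk above for a toy sequence-labeling task with a small, fully enumerable label space; the <code>delta</code> and <code>psi</code> helpers are hypothetical stand-ins for the task-specific label distance and joint feature map.

<pre>
import numpy as np
from itertools import product

# Minimal sketch (toy assumptions): compute, for one training pair (x, y_gold),
#   max_y [ delta(y_gold, y) + w . psi(x, y) - w . psi(x, y_gold) ]
# i.e. the term that the regularized risk sums over all training samples.

def delta(y_true, y_pred):
    # toy label distance: Hamming distance between two tag sequences
    return sum(a != b for a, b in zip(y_true, y_pred))

def psi(x, y, n_tags):
    # toy joint feature map: feature (position i, tag) fires with value x[i]
    feats = np.zeros(len(x) * n_tags)
    for i, tag in enumerate(y):
        feats[i * n_tags + tag] = x[i]
    return feats

def structured_hinge(w, x, y_gold, n_tags):
    gold_score = w @ psi(x, y_gold, n_tags)
    candidates = product(range(n_tags), repeat=len(x))
    return max(delta(y_gold, y) + w @ psi(x, y, n_tags) - gold_score
               for y in candidates)

w = np.zeros(4)                                    # untrained weights
x, y_gold = [1.0, 2.0], (0, 1)
print(structured_hinge(w, x, y_gold, n_tags=2))    # -> 2.0
</pre>

The full regularized risk then adds the regularizer <math>\|w\|^2</math> and sums this term over all training samples, weighted by <math>C</math>.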


Since the regularized risk function above is non-differentiable, it is often reformulated as a quadratic program by introducing one slack variable <math>\xi_n</math> for each sample, each representing the value of the maximum. The standard structured SVM primal formulation is given as follows:

<math>
\min_{w, \xi} \; \|w\|^2 + C \sum_{n=1}^{l} \xi_n
</math>

subject to, for <math>n=1,...,l</math> and all <math>y \in Y</math>:

<math>
w^\top \Psi(x_n, y_n) - w^\top \Psi(x_n, y) + \xi_n \geq \Delta(y_n, y)
</math>
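As an illustration of this primal, the sketch below writes the same quadratic program for a tiny synthetic problem with an enumerable label space, using the CVXPY modeling library; the random joint feature vectors and the 0/1 label distance are toy assumptions. Note that it states one margin constraint per sample and candidate label, which is only feasible when <math>Y</math> can be enumerated; practical implementations avoid enumerating all constraints, for example with cutting-plane methods.

<pre>
import numpy as np
import cvxpy as cp

# Toy instantiation of the n-slack primal above: one margin constraint per
# training sample n and per candidate label y. All data here are synthetic.
rng = np.random.default_rng(0)
n_samples, n_labels, dim, C = 3, 4, 5, 1.0
Psi = rng.normal(size=(n_samples, n_labels, dim))  # Psi[n, y] = joint feature vector
gold = [0, 1, 2]                                   # correct label index per sample
delta = lambda y_true, y: 0.0 if y == y_true else 1.0  # toy 0/1 label distance

w = cp.Variable(dim)
xi = cp.Variable(n_samples, nonneg=True)           # one slack per sample
constraints = [
    Psi[n, gold[n]] @ w - Psi[n, y] @ w + xi[n] >= delta(gold[n], y)
    for n in range(n_samples) for y in range(n_labels)
]
problem = cp.Problem(cp.Minimize(cp.sum_squares(w) + C * cp.sum(xi)), constraints)
problem.solve()
print(w.value, xi.value)
</pre>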

A slightly different version of the loss function:


      

== Related Papers ==