Taskar et al. 2004. Max-margin Parsing

From Cohen Courses
Revision as of 18:02, 30 October 2011

Max-margin parsing, by Taskar, B., Klein, D., Collins, M., Koller, D. and Manning, C. In Proc. EMNLP, 2004.

This paper is available online [1].

Summary

This paper presents a novel approach to parsing that maximizes separating margins using SVMs. The authors show how the parsing problem can be reformulated as a discriminative task, which allows an arbitrary number of features to be used. This formulation also lets them incorporate a loss function that directly penalizes incorrect parse trees appropriately.

Brief description of the method

Instead of a probabilistic interpretation for parse trees, we seek to find:

<math>y_i = \arg\max_{y\in\mathcal{G}(x_i)} \langle w, \Phi(x_i,y)\rangle</math>

for all sentences <math>x_i</math> in the training data, <math>y_i</math> being the true parse tree and <math>\mathcal{G}(x_i)</math> the set of possible parses for <math>x_i</math>.
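As a minimal sketch of this decoding rule, the snippet below scores each candidate parse by a linear model <math>\langle w,\Phi(x,y)\rangle</math> and returns the argmax. The feature map, weights and candidate set are toy assumptions for illustration, not the paper's.

```python
def predict(w, x, candidates, features):
    """Return the candidate parse y maximizing the linear score w . Phi(x, y)."""
    def score(y):
        # sparse dot product between weight vector and feature vector
        return sum(w.get(f, 0.0) * v for f, v in features(x, y).items())
    return max(candidates, key=score)

# Toy setup (assumed): a "parse" is just a tuple of rules, and features count rules.
def toy_features(x, y):
    return {rule: 1.0 for rule in y}

w = {"S->NP VP": 2.0, "S->VP NP": -1.0}
best = predict(w, "time flies", [("S->NP VP",), ("S->VP NP",)], toy_features)
print(best)  # ('S->NP VP',)
```

In the paper the candidate set <math>\mathcal{G}(x)</math> is exponential, so the argmax is computed with a chart-based dynamic program rather than by enumeration as here.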

Formulating it as an optimization problem,

<math>\min_{w,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_i \xi_i \quad \text{s.t.}\quad \langle w, \Phi(x_i,y_i)\rangle \geq \langle w, \Phi(x_i,y)\rangle + L_i(y) - \xi_i \quad \forall i,\ \forall y\in\mathcal{G}(x_i)</math>

Using SVM duality, we can find the dual of the above program

<math>\max_{\alpha}\ \sum_{i,y} \alpha_i(y) L_i(y) - \frac{1}{2}C\,\Big\|\sum_{i,y} \big(I_i(y)-\alpha_i(y)\big)\,\Phi(x_i,y)\Big\|^2</math>

s.t. <math>\sum_y \alpha_i(y) = 1,\ \alpha_i(y)\geq 0\quad \forall i,y</math>

where <math>I_i(y)</math> indicates whether <math>y</math> is the true parse for sentence <math>i</math>.
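The margin constraints above correspond to a structured hinge loss. The sketch below evaluates that loss by brute-force enumeration over a toy candidate list; the names, feature map and 0/1 loss are illustrative assumptions, and the paper itself never enumerates parses explicitly.

```python
def structured_hinge(w, x, y_true, candidates, features, loss):
    """max over y of [ score(y) + L(y_true, y) - score(y_true) ], clipped at zero."""
    def score(y):
        return sum(w.get(f, 0.0) * v for f, v in features(x, y).items())
    s_true = score(y_true)
    worst = max(score(y) + loss(y_true, y) - s_true for y in candidates)
    return max(0.0, worst)

# Toy example (assumed): weights too small to satisfy the margin, so the loss is positive.
feats = lambda x, y: {rule: 1.0 for rule in y}
zero_one = lambda a, b: 0.0 if a == b else 1.0
w = {"S->NP VP": 0.5, "S->VP NP": 0.0}
print(structured_hinge(w, "time flies", ("S->NP VP",),
                       [("S->NP VP",), ("S->VP NP",)], feats, zero_one))  # 0.5
```

A weight vector that separates the true parse from every alternative by at least its loss drives this quantity to zero, which is exactly what the constraints demand.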

For each sentence, we need to enumerate all possible parse trees, which is exponential in size. However, we can make use of local substructures, as in chart-parsing dynamic programming algorithms, to factor these trees into parts like <math>\langle A,s,e,i\rangle</math> and <math>\langle A\rightarrow B C,s,m,e,i\rangle</math>, where <math>s,m,e,i</math> refer to the start, split and end points and the sentence number respectively.

Therefore,

<math>\Phi(x,y)=\sum_{r\in R(x,y)}\phi(x,r)</math>

where <math>R(x,y)</math> is the set of all possible parts. <math>\phi</math> can be any function that maps a rule production part to some feature vector representation.
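The factored feature map can be sketched as follows: parts are extracted from a binary tree as span tuples <math>\langle A,s,e\rangle</math> and anchored rules <math>\langle A\rightarrow B C,s,m,e\rangle</math> (the sentence index <math>i</math> is dropped since we handle one sentence), and <math>\Phi</math> sums indicator features over them. The tuple tree encoding and indicator features are illustrative assumptions.

```python
from collections import Counter

def leaf_count(tree):
    # a leaf is (POS, word); an internal node is (label, left, right)
    return 1 if len(tree) == 2 else leaf_count(tree[1]) + leaf_count(tree[2])

def parts(tree, s=0):
    """Yield the span and anchored-rule parts of a binary tree rooted at position s."""
    if len(tree) == 2:                      # leaf (POS, word)
        yield ("span", tree[0], s, s + 1)
        return
    label, left, right = tree
    m = s + leaf_count(left)                # split point
    e = s + leaf_count(tree)                # end point
    yield ("span", label, s, e)
    yield ("rule", label, left[0], right[0], s, m, e)
    yield from parts(left, s)
    yield from parts(right, m)

def Phi(tree):
    """Phi(x, y) = sum over parts r in R(x, y) of indicator features phi(x, r)."""
    return Counter(parts(tree))

tree = ("S", ("NP", "time"), ("VP", "flies"))
print(sorted(Phi(tree)))
```

Because <math>\Phi</math> decomposes over parts, inner products needed for training and decoding can be computed part-by-part inside a chart, which is what makes the exponential space of parses tractable.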

Related Papers

In Bartlett et al. NIPS 2004, they used the exponentiated gradient (EG) algorithm for large-margin structured classification.