Taskar et al. 2004. Max-margin Parsing

From Cohen Courses
Revision as of 17:02, 30 October 2011 by Ysim (talk | contribs)
Jump to navigationJump to search

Max-margin parsing, by Ben Taskar, Taskar, B. and Klein, D. and Collins, M. and Koller, D. and Manning, C.. In Proc. EMNLP, 2004.

This Paper is available online [1].

Summary

This paper presents a novel approach to parsing by maximizing separating margins using SVMs. They show how we can reformulate the parsing problem as a discriminative task, which allow an arbitrary number of features to be used. Also, such a formulation allows them to incorporate a loss function that directly penalizes incorrect parse trees appropriately.

Brief description of the method

Instead of a probabilistic interpretation for parse trees, we seek to find:

for all sentences in the training data, being the parse tree, the set of possible parses for .

Formulating it as an optimization problem,

Using SVM, we can find the dual of the above program

s.t

where indicates whether is the true parse for sentence

For each sentence, we need to enumerate all possible parse trees, which is exponential in size. However, we can make use of local substructures similar to chart parsing dynamic programming algorithms to factor these trees into parts like and , where refers to start, split, end points and sentence number respectively.

Therefore,

where Failed to parse (syntax error): {\displaystyle R(x,y}} is the set of all possible parts. can be any function that maps a rule production part to some feature vector representation.

Related Papers

In Bartlett et al NIPS 2004, they used the EG algorithm for large margin structured classification.