McDonald et al, ACL 2005: Non-Projective Dependency Parsing Using Spanning Tree Algorithms

== Citation ==

R. McDonald, F. Pereira, K. Ribarov, J. Hajič. '''Non-projective dependency parsing using spanning tree algorithms''', ''Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing'', pp. 523-530, Vancouver, October 2005.

== Online Version ==

PDF version
== Summary ==
This [[Category::paper]] addresses the problem of non-projective [[AddressesProblem::Dependency Parsing|dependency parsing]].
Given a sentence <math>\mathbf{x} = (x_1, \ldots, x_n)</math>, we can construct a directed graph <math>\mathbf{G_x} = (V,E)</math>. The vertex set <math>V</math> contains one vertex <math>v_i</math> for each word <math>x_i</math> in the sentence, plus a dummy vertex <math>v_0</math> for the "root". The edge set <math>E</math> contains all directed edges of the form <math>(v_i, v_j)</math>, where <math>0 \le i \le n</math>, <math>1 \le j \le n</math>, and <math>i \ne j</math>. Each edge has a score of the form <math>s(i,j) = \mathbf{w} \cdot \mathbf{f}(i,j)</math>, where <math>\mathbf{f}(i,j)</math> is a feature vector depending on the words <math>x_i</math> and <math>x_j</math>, and <math>\mathbf{w}</math> is a weight vector.
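To make the edge-factored model concrete, here is a minimal sketch (not from the paper) that fills in the complete matrix of edge scores for one sentence. The names <code>score_matrix</code> and <code>feature_fn</code> are hypothetical, and <code>feature_fn</code> stands in for the lexical and part-of-speech feature extraction used in the actual parser; weights and feature vectors are assumed to be NumPy arrays.

<pre>
import numpy as np

def score_matrix(words, feature_fn, w):
    """Build s(i, j) = w . f(i, j) for every candidate edge of G_x.

    words      : the tokens x_1 .. x_n (index 0 is reserved for the dummy root)
    feature_fn : maps (sentence, head index i, modifier index j) to a feature vector
    w          : the weight vector
    """
    n = len(words)
    s = np.full((n + 1, n + 1), -np.inf)   # s[i, j] is the score of edge v_i -> v_j
    for i in range(n + 1):                 # heads, including the dummy root 0
        for j in range(1, n + 1):          # modifiers, never the root
            if i != j:
                s[i, j] = w @ feature_fn(words, i, j)
    return s
</pre>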
A (non-projective) dependency parse tree <math>\mathbf{y}</math> is a subgraph of <math>\mathbf{G_x}</math> that covers all the vertices in <math>V</math>, in which every vertex except the root has exactly one predecessor (incoming edge), and which contains no cycles; in other words, it is a spanning tree (arborescence) of <math>\mathbf{G_x}</math> rooted at <math>v_0</math>. The score of a dependency tree <math>\mathbf{y}</math> factors as the sum of the scores of its edges:
<math>s(\mathbf{x}, \mathbf{y}) = \sum_{(i,j) \in \mathbf{y}} \mathbf{w} \cdot \mathbf{f}(i,j)</math>
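If a candidate tree is stored as an array <code>head</code>, with <code>head[j]</code> giving the predecessor of word <math>x_j</math>, its score under this factorization is just a sum over the chosen edges. This is a minimal sketch reusing the hypothetical score matrix from above:

<pre>
def tree_score(s, head):
    """Score of a dependency tree under the edge-factored model.

    s    : (n+1) x (n+1) matrix of edge scores s(i, j)
    head : head[j] is the predecessor of word j, for j = 1 .. n (head[0] is unused)
    """
    return sum(s[head[j], j] for j in range(1, len(head)))
</pre>

For a three-word sentence with <code>head = [None, 2, 0, 2]</code> (word 2 attaches to the root, words 1 and 3 attach to word 2), this returns <code>s[2,1] + s[0,2] + s[2,3]</code>.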
The '''decoding problem''' is to find the max-scoring dependency parse tree <math>\mathbf{y}</math> for a given sentence <math>\mathbf{x}</math>, assuming that the weight vector <math>\mathbf{w}</math> is known. The '''learning problem''' is to find a weight vector <math>\mathbf{w}</math> such that, for every sentence in a training corpus, the correct dependency parse tree scores higher than all competing trees; the paper uses the online large-margin algorithm MIRA for this.
=== The Decoding Problem ===
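Because every word must receive exactly one head and the result must be a tree, decoding is exactly the problem of finding the maximum spanning tree (more precisely, the maximum spanning arborescence rooted at <math>v_0</math>) of <math>\mathbf{G_x}</math>, which admits non-projective structures naturally; the paper solves it with the Chu-Liu/Edmonds algorithm. Below is a minimal recursive sketch of Chu-Liu/Edmonds over a dense score matrix. It follows the textbook greedy-selection-and-cycle-contraction formulation rather than an optimized <math>O(n^2)</math> implementation, and <code>chu_liu_edmonds</code> and <code>_find_cycle</code> are hypothetical helper names.

<pre>
NEG = float("-inf")

def _find_cycle(head, root):
    """Return the set of nodes on a cycle of the graph {j -> head[j]}, or None."""
    n = len(head)
    color = [0] * n                          # 0 = unvisited, 1 = on current path, 2 = done
    for start in range(n):
        if start == root or color[start]:
            continue
        path, v = [], start
        while v != root and color[v] == 0:
            color[v] = 1
            path.append(v)
            v = head[v]
        if v != root and color[v] == 1:      # walked back into the current path: cycle
            return set(path[path.index(v):])
        for u in path:
            color[u] = 2
    return None

def chu_liu_edmonds(s, root=0):
    """Maximum spanning arborescence of the dense score matrix s, rooted at `root`.
    s[i][j] is the score of edge i -> j (illegal edges should be -inf).
    Returns head, where head[j] is the chosen predecessor of node j (head[root] = -1)."""
    n = len(s)
    # 1. Greedily give every non-root node its highest-scoring incoming edge.
    head = [-1] * n
    for j in range(n):
        if j != root:
            head[j] = max((i for i in range(n) if i != j), key=lambda i: s[i][j])
    # 2. If the greedy choice is already a tree, we are done.
    cycle = _find_cycle(head, root)
    if cycle is None:
        return head
    # 3. Otherwise contract the cycle into a single node c and recurse.
    cyc = sorted(cycle)
    rest = [v for v in range(n) if v not in cycle]
    idx = {old: new for new, old in enumerate(rest)}      # old node -> new index
    c = len(rest)                                         # index of the contracted node
    m = c + 1
    s2 = [[NEG] * m for _ in range(m)]
    enter_choice = {}        # head (new index) -> cycle node its edge would attach to
    leave_choice = {}        # modifier (new index) -> cycle node its edge would leave from
    for i in rest:
        for j in rest:
            if i != j:
                s2[idx[i]][idx[j]] = s[i][j]
    for i in rest:                                        # edges entering the cycle
        best, best_j = NEG, None
        for j in cyc:
            gain = s[i][j] - s[head[j]][j]                # gain from breaking the cycle at j
            if gain > best:
                best, best_j = gain, j
        s2[idx[i]][c] = best
        enter_choice[idx[i]] = best_j
    for k in rest:                                        # edges leaving the cycle
        if k == root:
            continue
        best, best_i = NEG, None
        for i in cyc:
            if s[i][k] > best:
                best, best_i = s[i][k], i
        s2[c][idx[k]] = best
        leave_choice[idx[k]] = best_i
    head2 = chu_liu_edmonds(s2, root=idx[root])
    # 4. Expand the solution of the contracted problem back to the original nodes.
    result = [-1] * n
    inv = {new: old for old, new in idx.items()}          # new index -> old node
    for j_new in range(m):
        h_new = head2[j_new]
        if h_new == -1:
            continue
        if j_new == c:                                    # the edge entering the cycle
            result[enter_choice[h_new]] = inv[h_new]
        elif h_new == c:                                  # an edge leaving the cycle
            result[inv[j_new]] = leave_choice[j_new]
        else:
            result[inv[j_new]] = inv[h_new]
    for j in cyc:                                         # keep cycle edges except the broken one
        if result[j] == -1:
            result[j] = head[j]
    return result
</pre>

Running it on the matrix produced by the hypothetical <code>score_matrix</code> above, e.g. <code>chu_liu_edmonds(score_matrix(words, feature_fn, w))</code>, yields the head array of the highest-scoring (possibly non-projective) parse.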
=== The Learning Problem ===
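The paper learns <math>\mathbf{w}</math> with MIRA, an online large-margin algorithm: after decoding each training sentence with the current weights, <math>\mathbf{w}</math> is changed as little as possible while forcing the gold tree to outscore the prediction by a margin equal to the number of wrongly attached words. The sketch below shows a single-best update of this kind (a simplification of the variants discussed in the paper); <code>mira_update</code>, <code>features_of_tree</code>, <code>decode</code>, and <code>feature_fn</code> are hypothetical names, and feature vectors are assumed to be NumPy arrays.

<pre>
def features_of_tree(x, head, feature_fn):
    """Sum of the edge feature vectors f(i, j) over the edges of a tree."""
    return sum(feature_fn(x, head[j], j) for j in range(1, len(head)))

def mira_update(w, x, gold_head, feature_fn, decode):
    """One online large-margin (single-best MIRA) step on sentence x."""
    pred_head = decode(w, x)                        # best tree under the current weights
    loss = sum(1 for j in range(1, len(gold_head))
               if gold_head[j] != pred_head[j])     # number of wrongly attached words
    if loss == 0:
        return w
    delta = (features_of_tree(x, gold_head, feature_fn)
             - features_of_tree(x, pred_head, feature_fn))
    margin_violation = loss - w @ delta             # L(y, y') - (s(x, y) - s(x, y'))
    norm_sq = delta @ delta
    if margin_violation <= 0 or norm_sq == 0:
        return w                                    # margin already satisfied
    return w + (margin_violation / norm_sq) * delta # smallest change that satisfies it
</pre>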
  
 
== Experiments ==
