Koo and Collins ACL 2010

Citation

Koo, T. and Collins, M. 2010. Efficient Third-Order Dependency Parsers. In Proceedings of ACL, pp. 1-11. Association for Computational Linguistics.

Online version

ACL

Summary

This paper presents a higher-order dependency parser that can evaluate substructures containing three dependencies, using both sibling-style and grandchild-style interactions. The algorithms presented require only $O(n^{4})$ time and were evaluated on the Penn Treebank and the Prague Dependency Treebank. The implementation code was publicly released [1].

Background

Dependency parsing is defined as a search for the highest-scoring analysis of a sentence $\,x$ :

$y^{*}(x)=_{y\in Y(x)}^{argmax}{\textrm {Score}}(x,y)$

where $\,Y(x)$ is the set of all trees compatible with $\,x$ and $\,{\textrm {Score}}(x,y)$ evaluates the event that tree $\,y$ is the analysis of $\,x$ . Directly solving the equation is unfeasible because the number of possible trees grow exponentially with the length of the sentence. A common strategy is to factor each dependency tree into small parts which can be scored individually, then:

${\textrm {Score}}(x,y)=\sum _{p\in y}{\textrm {ScorePart}}(x,p)$

The order of a part is defined according to the number of dependencies it contains, a terminology used in previous parsing algorithms. The factorizations in this new third-order parser uses the following types of parts:

Description of new method

The overall method augments each span with a grandparent index. The concept of spans comes from the structures in the Eisner algorithm (a first-order parser), where complete spans consist of a head-word and its descendants on one side, and incomplete spans consist of a dependency and the region between the head and modifier. Based on this, the paper introduces three parsing algorithms based on three different models.

Model 0: all grandchildren

This model factors each dependency tree into a set of grandchild parts (pairs of dependencies connected head to tail). Formally, a triple of indices $\,(g,h,m)$ where $\,(g,h)$ and $\,(h,m)$ are dependencies. Both complete and incomplete spans are augmented with grand-parent indices to produced g-spans. The next figure describes complete and incomplete g-spans and provides a graphical specification of the dynamic-programming algorithm:

Model 0 parsing algorithm is similar to Eisner algorithm, but every recursive construction sets the grandparent indices of the smaller g-spans. Model 0 can be parsed by adapting standard top-down or bottom-up chart parsing techniques. The following pseudocode sketches a bottom-up chart parser for this model:

The complexity of the algorithm is $O(n^{4})$ time and $O(n^{3})$ space because each derivation is defined by three fixed indices (the g-span boundaries) and a third free index (the split point). This is still a second-order parsing algorithm but the next models (1 and 2) are third-order parsers.

Model 1: all grand-siblings

This model decomposes each tree into a set of grand-sibling parts: 4-tuples of indices $\,(g,h,m,s)$ where $\,(h,m,s)$ is a sibling part and $\,(g,h,m)$ and $\,(g,h,s)$ are grandchild parts. These factorizations are parsed using sibling g-spans, as graphically described in the next figure:

The parsing algorithm for Model 1 is similar to the second-order sibling parser with the addition of grandparent indices. Like Model 0, Model 1 can also be parsed adapting standard chart-parsing techniques and despite using third-order parts, it also requires only $O(n^{4})$ time and $O(n^{3})$ space.

Model 2: grand-siblings and tri-siblings

This model captures both grand-sibling parts and tri-sibling parts: 4-tuples of indices $\,(h,m,s,t)$ where both $\,(h,m,s)$ and $\,(h,s,t)$ are sibling parts. These factorizations are parsed using a new-type of structure called s-spans (sibling-augmented spans), formally defined as $\,I_{h,m,s}$ where $\,I_{h,m}$ is a normal incomplete span and $\,s$ is an index lying in the strict interior of the range $\,[h,m]$ where $\,(h,m,s)$ forms a valid sibling part. The next figure describes Model 2 parsing algorithm, which also requires only $O(n^{4})$ time and $O(n^{3})$ space:

Experimental results

The accuracy of Models 1 and 2 were evaluated on the English and Czech test sets, and compared to relevant results from related work. Some models (marked with †) are not directly comparable because they depend on additional resources that were not used for training.

Related papers

The third-order dependency parsers build on ideas from existing parsing algorithms, specifically, the first-order Eisner algorithm introduced in Eisner COLING 1996 and the second-order sibling parser described in McDonald and Pereira EACL 2006.

Dependency parsing has been an active research subject and several techniques have been proposed, such as McDonald et al, ACL 2005: Non-Projective Dependency Parsing Using Spanning Tree Algorithms, Smith and Eisner 2008:Dependency parsing by belief propagation and Daume ICML 2009.

Koo and Collins ACL 2010

Contents

Citation

Online version

Summary

Background

Description of new method

Model 0: all grandchildren

Model 1: all grand-siblings

Model 2: grand-siblings and tri-siblings

Experimental results

Related papers

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Tools