Eisner algorithm

This [[Category::method]] is a widely-used dynamic-programming algorithm that addresses the problem of [[AddressesProblem::Dependency Parsing]] and serves as the basis for many papers. It was introduced in [[RelatedPaper::Eisner COLING 1996]] [http://www.cs.jhu.edu/~jason/papers/eisner.coling96.pdf Three New Probabilistic Models for Dependency Parsing: An Exploration] and is also described in [http://www.cs.jhu.edu/~jason/papers/eisner.iwptbook00.pdf Eisner 2000, Bilexical Grammars and Their Cubic-Time Parsing Algorithms].

== Background ==

Dependency grammars have proven very useful for several tasks in Natural Language Processing, especially because of the development of efficient parsing algorithms. Dependency grammars differ from phrase structure grammars in that they lack phrasal nodes: in a dependency structure, each word depends on exactly one other word, its head, as shown in the next figure.

[[File:ea-01.png]]

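For instance, in a toy sentence such as ''She eats fish'' (our own example, not the sentence in the figure), both ''She'' and ''fish'' depend on ''eats'', which in turn depends on an artificial root token, so every word has exactly one head. Writing each dependency as a (head, modifier) pair, the whole structure is

<math>\, \{(ROOT, eats),\; (eats, She),\; (eats, fish)\} </math>
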
Most parsing algorithms are based on the idea that the search for the highest-scoring analysis of a sentence becomes feasible when dependency trees are factored into sets of ''parts'' that have limited interactions. By exploiting the constraints that such factorizations impose, summations and maximizations over the set of possible trees can be performed efficiently.

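For example, under a first-order (arc-factored) view, the score of a tree decomposes into a sum over its individual dependencies, so the best tree can be found without enumerating the exponentially many candidates. In standard notation (ours, not verbatim from the papers above), where <math>\, T(x)</math> is the set of possible trees for a sentence <math>\, x</math> and <math>\, s(h,m)</math> is the score of a dependency with head <math>\, h</math> and modifier <math>\, m</math>:

<math>\, t^{*} = \arg\max_{t \in T(x)} \sum_{(h,m) \in t} s(h,m) </math>
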
== Definition ==

The Eisner algorithm is based on a "first-order" factorization, which decomposes a dependency tree into its individual dependencies (the ''order'' of a part is defined as the number of dependencies it contains). Two dynamic-programming structures are used: ''complete'' spans, which consist of a head-word and its descendants on one side, and ''incomplete'' spans, which consist of a dependency and the region between the head and modifier.

* Complete span: <math>\, C_{h,e} </math> where <math>\, h</math> and <math>\, e</math> are the indices of the head-word and the endpoint of the span.
* Incomplete span: <math>\, I_{h,m} </math> where <math>\, h</math> and <math>\, m</math> are the indices of the head and modifier of a dependency.

In other words, a complete span represents a "half-constituent" with <math>\, h</math> as the head, and an incomplete span represents only a partial half-constituent, since it can be extended by adding more modifiers to <math>\, m</math>. Each kind of span is created by recursively combining two smaller, adjacent spans, as shown in the next figure (following the arrows upward). The parsing algorithm builds successively larger spans in a dynamic-programming table, or chart. The minimal spans used to seed the chart are linked or unlinked word bigrams, such as <math>\, [ The \rightarrow plan] </math> or <math>\, [tax \, ROOT] </math> in the example.

[[File:ea-02.png]]

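In the span notation above, the chart can equivalently be seeded with trivial single-word complete spans, <math>\, C_{h,h} = 0 </math> for every index <math>\, h</math>, to which dependencies are then attached (this base case is our rephrasing, consistent with the recurrences given below).
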
An incomplete span is created from a pair of complete spans, indicating the division of the range <math>\, [h,m]</math> into constituents headed by <math>\, h</math> and <math>\, m</math>. A complete span is constructed by completing an incomplete span with the other half of the constituent of <math>\, m</math>. The next figure shows the two constructions and their points of concatenation, <math>\, m</math> in (a) and <math>\, r</math> in (b), which are also called ''split points'': free indices that must be enumerated to find the optimal construction. Complete spans are depicted as triangles and incomplete spans as trapezoids.

[[File:ea-03.png]]

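Written as recurrences over the chart (one standard way to present these derivations, shown here for the rightward case <math>\, h < m \le e</math>; the leftward case is symmetric), with <math>\, s(h,m)</math> as above:

<math>\, I_{h,m} = \max_{h \le r < m} \left( C_{h,r} + C_{m,r+1} \right) + s(h,m) </math>

<math>\, C_{h,e} = \max_{h < m \le e} \left( I_{h,m} + C_{m,e} \right) </math>

The free indices <math>\, r</math> and <math>\, m</math> are exactly the split points shown in the figure.
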
=== Pseudocode ===

Therefore, in order to parse a sentence, the algorithm needs to find optimal constructions for all the complete and incomplete spans defined on the sentence. The Eisner algorithm finds them by adapting standard chart-parsing techniques to the recursive derivations. The following pseudocode describes the adapted algorithm, which runs in cubic time and requires quadratic space, because each derivation is defined by two fixed indices (the span boundaries) and a third free index (the split point).

[[File:ea-04.png]]

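As a concrete illustration of the pseudocode, here is a minimal Python sketch of the same first-order algorithm, assuming arc scores are supplied as an <math>\, n \times n</math> matrix with index 0 reserved for an artificial ROOT token; all function and variable names are ours, not from the original papers.

<pre>
import numpy as np

def eisner(scores):
    """Highest-scoring projective dependency tree under first-order scores.

    scores[h][m] is the score of a dependency with head h and modifier m;
    index 0 stands for an artificial ROOT token.  Returns heads, where
    heads[m] is the head of word m (heads[0] = -1 for ROOT itself).
    """
    n = len(scores)
    # complete[s, t, d] / incomplete[s, t, d]: best score of a span over
    # words s..t; d = 1 means headed by s (rightward), d = 0 headed by t.
    complete = np.zeros((n, n, 2))
    incomplete = np.zeros((n, n, 2))
    complete_bp = np.zeros((n, n, 2), dtype=int)    # split-point backpointers
    incomplete_bp = np.zeros((n, n, 2), dtype=int)

    for k in range(1, n):               # span width
        for s in range(n - k):
            t = s + k
            # Incomplete spans: join two adjacent complete spans and add
            # the dependency t -> s (d = 0) or s -> t (d = 1).
            cands = [complete[s, r, 1] + complete[r + 1, t, 0]
                     for r in range(s, t)]
            r = s + int(np.argmax(cands))
            incomplete[s, t, 0] = max(cands) + scores[t][s]
            incomplete[s, t, 1] = max(cands) + scores[s][t]
            incomplete_bp[s, t, 0] = incomplete_bp[s, t, 1] = r

            # Complete spans: extend an incomplete span with the other
            # half-constituent of the modifier.
            cands = [complete[s, r, 0] + incomplete[r, t, 0]
                     for r in range(s, t)]
            complete[s, t, 0] = max(cands)
            complete_bp[s, t, 0] = s + int(np.argmax(cands))

            cands = [incomplete[s, r, 1] + complete[r, t, 1]
                     for r in range(s + 1, t + 1)]
            complete[s, t, 1] = max(cands)
            complete_bp[s, t, 1] = s + 1 + int(np.argmax(cands))

    heads = [-1] * n
    _backtrack(incomplete_bp, complete_bp, 0, n - 1, 1, True, heads)
    return heads

def _backtrack(inc_bp, comp_bp, s, t, d, is_complete, heads):
    """Recover the dependencies by following the stored split points."""
    if s == t:
        return
    if is_complete:
        r = comp_bp[s, t, d]
        if d == 1:      # headed by s: incomplete [s, r] + complete [r, t]
            _backtrack(inc_bp, comp_bp, s, r, 1, False, heads)
            _backtrack(inc_bp, comp_bp, r, t, 1, True, heads)
        else:           # headed by t: the mirror image
            _backtrack(inc_bp, comp_bp, s, r, 0, True, heads)
            _backtrack(inc_bp, comp_bp, r, t, 0, False, heads)
    else:
        r = inc_bp[s, t, d]
        if d == 1:
            heads[t] = s    # attach the dependency s -> t
        else:
            heads[s] = t    # attach the dependency t -> s
        _backtrack(inc_bp, comp_bp, s, r, 1, True, heads)
        _backtrack(inc_bp, comp_bp, r + 1, t, 0, True, heads)
</pre>

A hypothetical call, for a three-word sentence plus the ROOT token:

<pre>
scores = np.random.rand(4, 4)   # rows are heads, columns are modifiers
print(eisner(scores))           # a list of 4 head indices; heads[0] = -1
</pre>

The two charts and their backpointers each take quadratic space, and the three nested loops (span width, start position, split point) give the cubic running time mentioned above.
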
== Relevant Papers ==