Charniak and Johnson 2005

From Cohen Courses
Revision as of 22:48, 13 November 2011 by Fkeith (talk | contribs)
Jump to navigationJump to search

This paper is a work in progress by Francis Keith

Citation

"Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking", Eugene Charniak, Mark Johnson, ACL 2005

Online Version

An online version of this paper can be found here: [1]

Summary

The authors describe a method that uses discriminative reranking for parsing. It uses the standard Charniak parser to generate the 50 best parse trees for a given sentence. They use the 50 parse trees in a MaxEnt reranker. The results show a small improvement over the state of the art (at the time), but is a significantly more efficient.

Getting the -best parses

They first describe using a simple extension to the dynamic programming algorithm to find the best parse (i.e. Viterbi), and instead store the best parses. The problem with this approach is that it increases the storage space significantly, especially if states are split a lot.

However, the parser they used solves the space problem by using a "coarse-to-fine" approach. They do this by using a "crude" grammar that only contains CFG features and parent and neighbor labels, which contains few enough states to be able to use a simple bottom up dynamic programming approach. Because of the coarseness of this parse, they get a polynomial sized forest. Each edge is pruned based on marginal probabilities conditioned on the string , if the probability given by the following equation falls below a threshold (they use )

The fine-grain model is then applied, but due to the smaller candidate set, the space size is manageable. During evaluation, the -best paths are kept at each state.

They produce results on section 23 of the WSJ for various numbers of best parses.

n 1 2 10 25 50
-score 0.897 0.914 0.948 0.960 0.968

Reranking the parses