Dreyer and Eisner, EMNLP 2006

From Cohen Courses
Revision as of 18:16, 27 November 2011 by Amr1

Better Informed Training of Latent Syntactic Features

This paper can be found at: http://www.clsp.jhu.edu/~markus/dreyer+eisner.emnlp06.pdf

Citation

Markus Dreyer and Jason Eisner. Better informed training of latent syntactic features. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 317–326, Sydney, Australia, 2006. DOI: 10.3115/1610075.1610120

Summary

The authors wanted to improve parsing accuracy by introducing hidden (latent) features. Linguistic features such as number and gender should help constrain the parse trees, but the treebank does not annotate them. The authors therefore used a modified EM algorithm with simulated annealing to induce the features and then construct rules that take advantage of them. The main contribution is an attempt to improve upon the work of Matsuzaki et al. by reducing the number of degrees of freedom to be learned, so that the syntactic features can take on a greater range of values. To that end, the new method allows less freedom in learning transfer probabilities. The end result was no improvement over the previous work.
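The degrees-of-freedom argument can be made concrete with a rough count. The sketch below is only an illustration, not the paper's exact parameterization: it assumes that in the unconstrained model every joint assignment of feature values to a rule's parent and children gets its own probability, while in the constrained model each child either copies the parent's feature value or draws a fresh one independently. The function names are hypothetical.

```python
def unconstrained_count(num_children, num_latent):
    """Table entries for one rule when parent and children are annotated freely."""
    # One probability per joint assignment of feature values
    # to the parent and to each child.
    return num_latent ** (1 + num_children)


def constrained_count(num_children, num_latent):
    """Table entries for one rule under a simplified copy-or-redraw scheme.

    ASSUMPTION (illustrative only, not taken from the paper): each child
    either copies the parent's feature value or draws its own value
    independently of the parent.
    """
    pass_patterns = 2 ** num_children          # copy / do not copy, per child
    fresh_values = num_children * num_latent   # one distribution per child
    return pass_patterns + fresh_values


if __name__ == "__main__":
    # Compare both counts for a binary rule as the feature range grows.
    for num_latent in (2, 4, 8, 16):
        print(num_latent,
              unconstrained_count(2, num_latent),
              constrained_count(2, num_latent))
```

Under these assumptions the unconstrained count for a binary rule grows cubically in the number of feature values, while the constrained count grows linearly, which is why a constrained model can afford a larger range of values.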

Previous work

As the paper puts it, "treebanks never contain enough information". Much earlier parsing work split the nonterminals so that the important ones could be trained better, but the splitting was mostly done in an ad hoc fashion until Matsuzaki et al. (2005) introduced PCFG-LA (probabilistic context-free grammar with latent annotations), which learns the splits automatically. Dreyer and Eisner wanted to incorporate features that only propagated in "linguistically-motivated ways".
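To show what splitting nonterminals with latent annotations looks like, the following minimal sketch enumerates every annotated version of a single treebank rule. The rule NP -> DT NN, the bracket notation, and the helper name split_rule are illustrative choices, not taken from either paper.

```python
from itertools import product


def split_rule(parent, children, num_latent):
    """Enumerate every latent-annotated version of one treebank rule."""
    symbols = [parent, *children]
    annotated = []
    # Each symbol in the rule independently receives one of num_latent values.
    for values in product(range(num_latent), repeat=len(symbols)):
        lhs, *rhs = [f"{sym}[{v}]" for sym, v in zip(symbols, values)]
        annotated.append((lhs, tuple(rhs)))
    return annotated


if __name__ == "__main__":
    # Hypothetical example: split NP -> DT NN with two latent values per symbol.
    for lhs, rhs in split_rule("NP", ["DT", "NN"], num_latent=2):
        print(lhs, "->", " ".join(rhs))
```

With two latent values per symbol, the single observed rule expands into eight annotated rules, each of which needs its own probability. The treebank only shows the unannotated rule, so these probabilities have to be estimated with a procedure such as EM.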

Improvements in Paper

Experimental Results

Matsuzaki et al, ACL 2005