Tratz and Hovy, EMNLP 2011
Contents
Citation
Stephen Tratz and Eduard Hovy. A Fast, Effective, Non-Projective, Semantically-Enriched Parser. EMNLP 2011.
Online Version
http://aclweb.org/anthology/D/D11/D11-1116.pdf
Summary
The gist of this Paper is that they took a fast, easy-first dependency parsing algorithm and made it a little bit better. "Easy-first" means that at each step of the algorithm you find the best scoring attachment and do that one, so you can implement the parsing algorithm in O(nlogn) time, which is really pretty good, considering standard constituency parsers are O(n^3). They have a slightly inefficient, but easier, implementation that actually runs in O(n^2) instead of O(nlogn), but it still can parse 75 sentences a second, with fully labeled dependencies, which is really quite impressive (I've used the Stanford dependency parser, and it was 1 sentence per second or two).
Methods
They made their baseline algorithm better by adding a step to handle non-projective dependencies (about 8% of the sentences in their data - the Penn treebank - had at least one non-projective link), by adding labels to the dependencies (the Stanford dependency labels plus some of their own tweaks), and by adding more features. Sadly, they say, "One of the key aspects of the parser is the complex set of features used," and then they don't actually describe their features in the paper, making the reader download and browse through their code to figure out what they are actually doing.
They train their model using a fairly typical structured perceptron, it looks like.
Experimental Results
Their results are impressive - they unquestionably beat all other algorithms they presented, using a fraction of the time. The Stanford dependency parser was noticeably missing from their presentation, however, and it wasn't clear to me why that was. They ended up with 93.7% unlabeled arc accuracy. The baseline was 91.2%, and almost all of the improvement came from the features that they failed to describe in the paper - without using the non-projective addition, only the extra features, they get 93.3%, only .4% lower than their full model. They do perform way better on non-projective arcs, though, going from 21.1% to 67.7% accuracy.
Other notes
The "Semantically-Enriched" part of their parser just meant, "and we can run some cool stuff at the end, too, to give you more information about the sentences." That's nice, but it seemed really orthogonal to the point of the paper, just taking up space that should have been devoted to describing their features that did so well. Their presentation about that was basically, "We used these other components that are explained in these other papers, and by the way, the accuracy is about the same as it was then."
Comment
Missing the comparison to Stanford Parser is a problem. They had a paper where they found their strategy (constituent parse, then rule-based dependency conversion), worked better than directly using a dependency parser -- though I suspect they compared to overly limited dependency parsers, at least according to the evaluations on their webpage. SDeps can be non-projective (and sometimes not even trees! depending what options you select), so maybe easy-first should be better than shift-reduce or MST parsers, because it is more flexible in the graph theoretic properties of the structures it can predict. --Brendan 18:13, 29 November 2011 (UTC)