Smith and Eisner 2008:Dependency parsing by belief propagation

Citation

Smith, David A. and Jason Eisner (2008). Dependency parsing by belief propagation. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 145-156, Honolulu, October.

Online version

Smith and Eisner 2008

Summary

This is a crucial paper that presents a loopy Belief Propagation (BP) method for Dependency Parsing, which can also be easily applied to general problems in Named Entity Recognition, Word Alignment, Shallow Parsing, and Constituent Parsing. The paper formulates the dependency parsing problem as a learning and decoding problem on a graphical model with global constraints. The authors show that BP needs only $O(n^{3})$ time to perform approximate inference on a graphical model, with second-order features and latent variables incorporated.

Brief Description of the Method

This paper first introduces the method of formulating the dependency parsing problem as training and decoding on Markov random fields, then discusses the use of Belief Propagation to lower asymptotic runtime during training and decoding. In this section, we will first summarize the method they use to formulate the problem, then briefly describe the method of using BP for this task.

Graphical Models for Dependency Parsing

Training and Decoding using BP

Dataset

Experimental Results

In the experiment section of this paper, the authors conducted three major experiments. First, they explored whether BP can beat Dynamic Programming (DP), in terms of the efficiency. Secondly, they looked at the non-projective parsing problem, and checked whether high-order features were useful, and how BP could make it tractable. Last but not the least, they have also examined whether global constraints contribute to the accuracy of dependency parsing, under this proposed BP framework.

Efficiency Evaluation: Comparing to Dynamic Programming (DP)

On Figure 2 and 3, it is clear that BP is much faster than DP under various settings. And when comparing Figure 2 and 3, it is shown that when adding a higher order (more complex model), the gap between BP and DP is widen.