Banko and Etzioni ACL 2008

Citation

M. Banko and O. Etzioni. 2008. The Tradeoffs Between Open and Traditional Relation Extraction. In Proceedings of the 46th Annual Meeting of the Association For Computational Linguistics (Columbus, Ohio, June 15 - 20, 2008). ACL Workshops. Association for Computational Linguistics, 28--36.

Online version

ACL Anthology

Summary

This paper discusses the tradeoffs between traditional IE and open IE, using the TextRunner system as the open IE example. The authors argue that the relationship between standard IE systems and the new Open IE paradigm is analogous to the relationship between lexicalized and unlexicalized parsers. The paper also proposes replacing the Naive Bayes classifier in the original TextRunner system with a CRF model. In the evaluation, the improved TextRunner system achieved precision similar to that of standard IE systems, but with low recall. The authors therefore propose combining the results of standard IE and open IE systems to improve recall.

The features used to train the CRF are the same as those in the original TextRunner. The CRF implementation comes from Mallet, and the authors use OpenNLP to obtain POS tags and chunking information for each sentence.
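
To make the role of the POS and chunk information more concrete, here is a minimal sketch of how per-token features for a sequence model could be built from those tags alone. This page does not list the actual feature set, so the function names, feature names, window size, and the hand-written tags below are all assumptions for illustration. The dictionaries are just an in-memory representation, not Mallet's input format.

```python
def token_features(pos_tags, chunk_tags, i, window=1):
    """Build a feature dict for position i from POS and chunk tags only.

    Hypothetical feature names (not taken from the paper). No word
    identities are used, so the features do not depend on any
    particular relation.
    """
    features = {"pos": pos_tags[i], "chunk": chunk_tags[i]}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        if 0 <= j < len(pos_tags):
            features[f"pos[{offset:+d}]"] = pos_tags[j]
            features[f"chunk[{offset:+d}]"] = chunk_tags[j]
        else:
            # Mark context positions that fall outside the sentence.
            features[f"pos[{offset:+d}]"] = "PAD"
            features[f"chunk[{offset:+d}]"] = "PAD"
    return features


def sentence_features(pos_tags, chunk_tags, window=1):
    """Return one feature dict per token of the sentence."""
    return [token_features(pos_tags, chunk_tags, i, window)
            for i in range(len(pos_tags))]


# Hand-written tags for "Einstein was born in Ulm" (assumed, not OpenNLP output).
pos_tags = ["NNP", "VBD", "VBN", "IN", "NNP"]
chunk_tags = ["B-NP", "B-VP", "I-VP", "B-PP", "B-NP"]
features = sentence_features(pos_tags, chunk_tags)
```

In a real pipeline the tags would come from a tagger and chunker such as OpenNLP, and the feature sequences would be paired with label sequences to train the CRF.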

To combine the results of the standard IE and open IE systems, the authors used stacking, a technique in which a meta-classifier is trained on the outputs of the base systems.
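
As a rough illustration of stacking, the sketch below trains a small logistic-regression meta-classifier on the confidence scores of two base extractors. It is not the authors' setup: the scores, labels, learning rate, and function names are invented for the example, and the paper's meta-learner and features may differ.

```python
import math


def train_meta_classifier(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    Each row holds the base systems' outputs for one candidate
    extraction; each label says whether the extraction is correct.
    """
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def accept(weights, bias, x):
    """Return True if the meta-classifier accepts the candidate extraction."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return z >= 0.0


# Invented held-out data: (open IE score, traditional IE score) per candidate.
rows = [(0.9, 0.8), (0.9, 0.2), (0.1, 0.9), (0.1, 0.7),
        (0.8, 0.6), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3)]
labels = [1, 1, 1, 1, 1, 0, 0, 0]

weights, bias = train_meta_classifier(rows, labels)
decisions = [accept(weights, bias, x) for x in rows]
```

The point of the design is that the meta-classifier can accept an extraction that only one base system is confident about, which is how combining the systems can raise recall.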

The evaluation data came from a previous paper (Bunescu and Mooney, ACL 2007).

Related papers

This paper builds on the original TextRunner system described in Banko_et_al_IJCAI_2007. The work was later compared with another open IE system, WOE, in Wu_and_Weld_ACL_2010.