Hovy et al., EMNLP 2009

Citation

Eduard Hovy, Zornitsa Kozareva, and Ellen Riloff. 2009. Toward completeness in concept extraction and classification. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP '09), Vol. 2. Association for Computational Linguistics, Morristown, NJ, USA, 948-957.

Online version

ACL Anthology

Summary

This paper proposes a method that automatically extracts terms related to given concepts, together with the taxonomic organization (is-a links) among the extracted terms. Essentially, the authors build the taxonomy below the given concepts. The method starts from a simple surface pattern with a few open positions and fills these positions in different ways to extract terms and their taxonomic links.

  1. The first step of the method generates more seed terms using the Doubly-Anchored Pattern (DAP) (Kozareva et al., ACL-HLT 2008).
    • DAP: [seed1] such as [seed2] and [X]
    • These terms are then ranked based on how frequently they can be used to find more seed terms.
  2. Then the Doubly-Anchored Pattern is used in the 'backward' direction to find the higher-level seed.
    • Backward DAP: [X] such as [seed1] and [seed2]
    • The higher-level terms are filtered based on whether they can be used to find more lower-level terms and whether they are below the given root concept.

The above two steps are executed alternately on the snippets returned from Google.
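
The two pattern directions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `forward_dap` and `backward_dap`, the single-word `\w+` slot, and the toy snippets are all assumptions made for illustration, and the ranking and filtering steps are reduced to simple match counts.

```python
import re
from collections import Counter


def forward_dap(seed1, seed2, snippets):
    """Count fillers of [X] in '[seed1] such as [seed2] and [X]'.

    Hypothetical helper: the open slot is simplified to one word.
    """
    pattern = re.compile(
        rf"\b{re.escape(seed1)} such as {re.escape(seed2)} and (\w+)",
        re.IGNORECASE,
    )
    found = Counter()
    for text in snippets:
        for match in pattern.finditer(text):
            found[match.group(1).lower()] += 1
    return found


def backward_dap(seed1, seed2, snippets):
    """Count fillers of [X] in '[X] such as [seed1] and [seed2]'.

    Hypothetical helper: the open slot is simplified to one word.
    """
    pattern = re.compile(
        rf"(\w+) such as {re.escape(seed1)} and {re.escape(seed2)}\b",
        re.IGNORECASE,
    )
    found = Counter()
    for text in snippets:
        for match in pattern.finditer(text):
            found[match.group(1).lower()] += 1
    return found


# Toy snippets invented for this example (not data from the paper).
snippets = [
    "Animals such as lions and tigers live in the wild.",
    "He saw animals such as lions and leopards.",
    "Predators such as lions and tigers hunt at night.",
    "Felines such as lions and tigers are carnivores.",
]

# Step 1: harvest more lower-level terms for the concept "animals".
lower_terms = forward_dap("animals", "lions", snippets)

# Step 2: harvest candidate higher-level terms from a pair of lower-level terms.
higher_terms = backward_dap("lions", "tigers", snippets)
```

In this sketch, terms found in step 1 would be paired up and fed to step 2, and the higher-level terms that survive filtering would be fed back into step 1, which is one way to read the alternation described above.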


Evaluation

The evaluation was done using the 1000 web snippets returned from Google on two concepts: lions for Animals and Madonna for People. The results were evaluated by human judges.

Compared with WordNet, the proposed automatic method achieved comparable precision on Animals and better precision on People. In both cases, coverage was much better with the proposed method.

Related papers

A similar idea of using hyponym patterns was used in the KnowItAll system (Etzioni et al., JAI 2005).