Metaphor identification using verb and noun clustering
Citation
Ekaterina Shutova, Lin Sun, and Anna Korhonen (2010). Metaphor identification using verb and noun clustering. COLING 2010.
Online Version
http://dl.acm.org/citation.cfm?id=1873894
Summary
- Basic Idea: The paper starts from a small set of metaphorical Subject-Verb or Verb-Object pairs, e.g., (nation, flex, nsubj) or (burn, money, dobj), expands them, and uses the expanded pairs to find new metaphors in open text. The authors build separate clusters of verbs and of nouns, and expand the S-V/V-O pairs based on those clusters.
- Metaphor Seed: Shutova and Teufel (2010) annotated metaphorical expressions in a subset of the British National Corpus (BNC); the authors took these annotated expressions as the seed metaphors.
- Features for Verb Clustering: They adopted automatically acquired verb subcategorization frames (SCFs) parameterized by their selectional preferences (SPs). They obtained these features using the SCF acquisition system of Preiss et al. (2007).
- Features for Noun Clustering: They used grammatical relations as features for noun clustering, i.e., they employed all the argument heads and verb lemmas appearing in the subject, direct object and indirect object relations in the RASP-parsed BNC.
- Clustering Algorithm: spectral clustering (SPEC).
- Selectional Preference Strength Filter: They adopted the method proposed by Resnik (1997) to measure the selectional preference strength of verbs over the noun clusters, and filtered out verbs with weak selectional preferences, e.g., "take", "put", "get".
- Evaluation: They searched the BNC for metaphors and had human annotators judge the results. The method reaches a precision of 0.79, while the baseline (expanding the seeds with WordNet) attains only 0.44.
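The expansion and filtering steps summarized above can be illustrated with a small Python sketch. It is not the authors' implementation. The clusters, counts, corpus triples, the (verb, noun, relation) tuple layout, and the threshold of 0.5 are all made up for illustration, and the log base 2 is an assumption. The only part taken from the literature is the general form of Resnik's selectional preference strength, the KL divergence between P(class | verb) and P(class), with noun clusters standing in for the classes.

```python
import math
from collections import Counter

def cluster_of(word, clusters):
    """Return the cluster containing `word`, or a singleton set if none does."""
    for cluster in clusters:
        if word in cluster:
            return cluster
    return {word}

def expand_seeds(seeds, verb_clusters, noun_clusters):
    """Pair every verb in the seed verb's cluster with every noun in the seed noun's cluster."""
    candidates = set()
    for verb, noun, rel in seeds:
        for v in cluster_of(verb, verb_clusters):
            for n in cluster_of(noun, noun_clusters):
                candidates.add((v, n, rel))
    return candidates

def sp_strength(verb, pair_counts, noun_clusters):
    """Resnik-style SP strength: KL(P(c | verb) || P(c)) over noun clusters c (log base 2)."""
    noun_to_cluster = {n: i for i, cl in enumerate(noun_clusters) for n in cl}
    prior, posterior = Counter(), Counter()
    for (v, noun), count in pair_counts.items():
        c = noun_to_cluster.get(noun)
        if c is None:
            continue  # noun not covered by any cluster
        prior[c] += count
        if v == verb:
            posterior[c] += count
    total, verb_total = sum(prior.values()), sum(posterior.values())
    if verb_total == 0:
        return 0.0  # unseen verb: treated as having no measurable preference
    return sum(
        (n / verb_total) * math.log2((n / verb_total) / (prior[c] / total))
        for c, n in posterior.items()
    )

def find_metaphors(corpus, seeds, verb_clusters, noun_clusters, pair_counts, threshold):
    """Keep corpus triples that match an expanded seed and whose verb passes the SP filter."""
    candidates = expand_seeds(seeds, verb_clusters, noun_clusters)
    return [
        (v, n, rel) for (v, n, rel) in corpus
        if (v, n, rel) in candidates
        and sp_strength(v, pair_counts, noun_clusters) >= threshold
    ]

# Toy data (all hypothetical). "take" is placed in the "burn" cluster only to exercise the filter.
verb_clusters = [{"flex", "stretch"}, {"burn", "waste", "take"}]
noun_clusters = [{"nation", "country"}, {"money", "cash"}]
seeds = [("flex", "nation", "nsubj"), ("burn", "money", "dobj")]
pair_counts = {
    ("burn", "money"): 4, ("burn", "cash"): 4, ("waste", "money"): 4,
    ("flex", "nation"): 4, ("flex", "country"): 4,
    ("take", "money"): 6, ("take", "nation"): 4,
}
corpus = [
    ("waste", "cash", "dobj"), ("take", "money", "dobj"),
    ("flex", "country", "nsubj"), ("eat", "money", "dobj"),
]

found = find_metaphors(corpus, seeds, verb_clusters, noun_clusters, pair_counts, threshold=0.5)
print(found)  # [('waste', 'cash', 'dobj'), ('flex', 'country', 'nsubj')]
```

In this toy setup "take" distributes its arguments over the noun clusters exactly as the corpus as a whole does, so its strength is 0 and the pair (take, money) is dropped even though it matches an expanded seed. The pair (eat, money) is dropped earlier because "eat" is not in any seed verb's cluster.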
Discussions
- Good clustering seems quite important. They build the verb clusters from features based on the verbs' selectional preferences, and the noun clusters from syntactic features. These features are probably critical for obtaining good clusters and thus good metaphor detection. However, their features and systems are not openly accessible and do not seem easy to reproduce.
- This method requires hand-labeled seeds. The process is therefore only semi-automatic, and human-picked seeds may have limited coverage. A fully automatic process would require the seeds to be generated automatically.
- How about recall? The authors have no data for evaluating recall, but approaches of this kind often combine high precision with low recall. That is a hidden issue of the method.
References
- Resnik, P. 1997. Selectional preference and sense disambiguation. In ACL SIGLEX Workshop on Tagging Text with Lexical Semantics, Washington, D.C.
- Shutova, E. and S. Teufel. 2010. Metaphor corpus annotated for source-target domain mappings. In Proceedings of LREC 2010, Malta.
- Preiss, J., T. Briscoe, and A. Korhonen. 2007. A system for large-scale acquisition of verbal, nominal and adjectival subcategorization frames from corpora. In Proceedings of ACL-2007, volume 45, page 912.