Metaphor Detection in Different Topics

Comments

Looks like a nice idea for a project - it will be interesting to see how this turns out.

One worry I have is evaluation - is there any plan for a quantitative evaluation? One idea might be to consider some prediction task (I don't know - comment volume? comment sentiment? there is some existing data on that) and see if the metaphor usage is a useful predictor. --Wcohen 15:35, 10 October 2012 (UTC)

Team Members

Project Abstract

Interest in metaphor detection is growing. The best-known approach detects violations of "selectional preference", mostly for verbs. The idea of "selectional preference" is that verbs have semantic preferences for their arguments. For instance, the verb "flex" strongly prefers "muscle" and "bone" as its object. If, in some text, the object of "flex" does not belong to the semantic class of "muscle" and "bone", the usage is very likely to be a metaphor.
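The violation test described above can be sketched with simple counts. This is only a minimal illustration, not the project's method: the noun-to-class map, the training pairs, and the 0.1 threshold are all made-up assumptions, and a real system would take classes from a lexical resource and pairs from a parsed corpus.

```python
from collections import Counter, defaultdict

# Hypothetical noun -> semantic class map (assumption for illustration only).
NOUN_CLASS = {
    "muscle": "body_part", "bone": "body_part", "bicep": "body_part",
    "power": "abstraction", "influence": "abstraction",
}

def build_preferences(pairs):
    """Count how often each verb takes an object of each semantic class."""
    prefs = defaultdict(Counter)
    for verb, obj in pairs:
        prefs[verb][NOUN_CLASS.get(obj, "unknown")] += 1
    return prefs

def is_violation(prefs, verb, obj, threshold=0.1):
    """Flag (verb, obj) when the object's class is rare for this verb."""
    counts = prefs.get(verb, Counter())
    total = sum(counts.values())
    if total == 0:
        return False  # no evidence about this verb, so no judgment
    share = counts[NOUN_CLASS.get(obj, "unknown")] / total
    return share < threshold

# Made-up (verb, object) observations standing in for parsed corpus data.
training = [("flex", "muscle")] * 8 + [("flex", "bone")] * 2
prefs = build_preferences(training)

print(is_violation(prefs, "flex", "bicep"))      # body part: fits the preference
print(is_violation(prefs, "flex", "influence"))  # abstraction: metaphor candidate
```

A flagged pair is only a candidate: rare but literal objects would be flagged too, so the threshold trades recall against false alarms.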

The central idea of our project is to observe how the selectional preferences of the same verbs differ across topics. For instance, in the topic of sport, the subjects of "flex" are mostly humans; but in the topic of finance or politics, the subjects of "flex" are mostly organizations or countries, e.g., "China to flex its financial muscles at US meeting." We are interested in this difference, and aim to observe how it affects metaphor detection techniques.
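One possible way to quantify this difference is to estimate, per topic, the distribution of subject classes for a verb and then compare the distributions. The sketch below uses Jensen-Shannon divergence as the comparison; that choice, the class labels, and the counts are our illustrative assumptions rather than part of the proposal.

```python
import math
from collections import Counter, defaultdict

def subject_distributions(observations):
    """Map (topic, verb) -> relative frequency of each subject class."""
    counts = defaultdict(Counter)
    for topic, verb, subject_class in observations:
        counts[(topic, verb)][subject_class] += 1
    dists = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        dists[key] = {cls: n / total for cls, n in counter.items()}
    return dists

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two class distributions."""
    classes = set(p) | set(q)
    mix = {c: 0.5 * (p.get(c, 0.0) + q.get(c, 0.0)) for c in classes}

    def kl_to_mix(a):
        return sum(a[c] * math.log2(a[c] / mix[c]) for c in a if a[c] > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)

# Made-up (topic, verb, subject class) observations.
observations = (
    [("sport", "flex", "human")] * 9
    + [("sport", "flex", "organization")] * 1
    + [("finance", "flex", "organization")] * 6
    + [("finance", "flex", "country")] * 4
)
dists = subject_distributions(observations)
gap = js_divergence(dists[("sport", "flex")], dists[("finance", "flex")])
print(round(gap, 3))  # 0 means identical preferences, 1 means disjoint
```

Verbs with a large gap between topics would be the ones where a single topic-independent preference model is most likely to mislabel usages.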

Data

We aim to find more "vivid" metaphors, so we plan to use blog corpora rather than newspaper corpora. The two candidate corpora are as follows.

The Blog Authorship Corpus
The Blog Authorship Corpus consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words - or approximately 35 posts and 7,250 words per person.

Political Blog Corpora

Considering corpus size, we might start with the Blog Authorship Corpus.

Techniques

Violation Detection of Selectional Preference
Several resources can be used to detect selectional preference violations. One of them is VerbNet, which contains information about the constraints on the arguments of verbs. By matching a verb and its arguments in the text against these constraints, we can detect violations.
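The matching step might look roughly like the following. The restriction table and the noun features here are hand-written stand-ins, not actual VerbNet entries, and the role names are only assumed to resemble VerbNet's thematic roles; a real implementation would read both from the resource itself.

```python
# Hand-written stand-in for VerbNet-style role restrictions
# (assumption: NOT copied from VerbNet).
RESTRICTIONS = {
    "flex": {"Agent": {"animate"}, "Patient": {"body_part"}},
}

# Hypothetical semantic features for a few nouns.
NOUN_FEATURES = {
    "athlete": {"animate", "human"},
    "muscle": {"body_part", "concrete"},
    "China": {"location", "organization"},
}

def violated_roles(verb, arguments):
    """Return the roles whose filler lacks a feature the verb requires."""
    required = RESTRICTIONS.get(verb, {})
    violations = []
    for role, noun in arguments.items():
        features = NOUN_FEATURES.get(noun)
        if features is None:
            continue  # unknown noun: make no judgment
        needed = required.get(role, set())
        if not needed <= features:
            violations.append(role)
    return violations

print(violated_roles("flex", {"Agent": "athlete", "Patient": "muscle"}))
print(violated_roles("flex", {"Agent": "China", "Patient": "muscle"}))
```

In the second call the object "muscle" still satisfies its restriction, so only the subject is flagged, which matches the "China to flex its financial muscles" example from the abstract.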

Topic Modeling
We want to use LDA to model the topics of blog posts. With topic modeling, we can observe how selectional preferences change across topics.
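To make the LDA step concrete, here is a toy collapsed Gibbs sampler written in plain Python. It is only a sketch for intuition: the hyperparameters, the four tiny "posts", and the helper names are our assumptions, and for a corpus of this size an established LDA library would be the practical choice.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns per-document topic counts."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic

def dominant_topic(counts):
    """Index of the topic with the most tokens in one document."""
    return max(range(len(counts)), key=counts.__getitem__)

# Four made-up, pre-tokenized posts.
docs = [
    "team player game score coach".split(),
    "game team win player season".split(),
    "bank market stock price investor".split(),
    "market stock bank trade price".split(),
]
doc_topic = lda_gibbs(docs, num_topics=2)
print([dominant_topic(row) for row in doc_topic])
```

Each post's dominant topic could then serve as the topic label under which its verb-argument pairs are counted.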

Word Clustering
In some of the metaphor detection literature, such as (Shutova et al., 2010), data sparsity is handled by first building semantic clusters of nouns and verbs, and then analyzing the selectional preference of "verb clusters" (rather than "verbs") toward "noun clusters" (rather than "nouns"). This approach seems reasonable for our setting, so we might also adopt it.
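The benefit of working at the cluster level can be illustrated with a small sketch. The cluster assignments below are hand-written assumptions (the cited work derives clusters automatically), and the pairs are made up; the point is only that pooled counts give an estimate for a verb-noun pair that was never observed.

```python
from collections import Counter

# Hand-written cluster assignments (assumption for illustration only).
VERB_CLUSTER = {"flex": "V_body_motion", "stretch": "V_body_motion",
                "bend": "V_body_motion"}
NOUN_CLUSTER = {"muscle": "N_body", "bicep": "N_body", "arm": "N_body",
                "power": "N_abstract"}

def cluster_counts(pairs):
    """Pool (verb, noun) observations into (verb cluster, noun cluster) counts."""
    counts = Counter()
    for verb, noun in pairs:
        counts[(VERB_CLUSTER[verb], NOUN_CLUSTER[noun])] += 1
    return counts

def cluster_preference(counts, verb, noun):
    """Estimate P(noun cluster | verb cluster) from the pooled counts."""
    vc, nc = VERB_CLUSTER[verb], NOUN_CLUSTER[noun]
    total = sum(n for (v, _), n in counts.items() if v == vc)
    return counts[(vc, nc)] / total if total else 0.0

# Made-up observations; note that ("flex", "bicep") never occurs.
pairs = [("flex", "muscle"), ("stretch", "arm"),
         ("bend", "arm"), ("stretch", "muscle")]
counts = cluster_counts(pairs)

print(cluster_preference(counts, "flex", "bicep"))  # unseen pair, same clusters
print(cluster_preference(counts, "flex", "power"))  # abstract noun cluster
```

With word-level counts, "flex bicep" would look just as unattested as "flex power"; pooling by cluster separates the two.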

Related Work

Birte Loenneker-Rodman and Srini Narayanan (2012). Computational Models of Figurative Language. Cambridge Encyclopedia of Psycholinguistics. Spivey, M., Joannisse, M., McRae, K. (eds.), Cambridge University Press, Cambridge. http://www1.icsi.berkeley.edu/~snarayan/CompFig.pdf
Ekaterina Shutova, Lin Sun, and Anna Korhonen (2010). Metaphor identification using verb and noun clustering. COLING 2010. http://dl.acm.org/citation.cfm?id=1873894
Ekaterina Shutova (2010). Models of metaphor in NLP. ACL 2010. http://dl.acm.org/citation.cfm?id=1858752
