Do metaphors shape a listener's thinking?

From Cohen Courses

Comments

  • Interesting and well-defined task. I am curious whether the topic or theme of a talk can affect what you classify as literal or figurative language. E.g., in a medical domain, "incisive" as in "incisive foramen" is literal; otherwise it describes a sharp, acute remark. Does your approach account for that?
    • No, my algorithm will not use topic models. I added the ‘Algorithm’ section to clarify the approach --Ytsvetko


  • Please detail how you plan to sample the test set.
    • I extended the evaluation section --Ytsvetko

--Apappu 12:22, 11 October 2012 (UTC)

Team members

Project Summary

Metaphors are powerful communication tools: they help deliver complex concepts and ideas simply and effectively, and they can shape a listener's opinion. We hypothesize that the semantic orientation of listeners' comments can be predicted by the amount of figurative language in a lecture: the more metaphors the speaker uses, the more support the speaker will get from the audience. This project will explore the relation between the amount of metaphor in a text and the polarity of listeners' feedback.

Dataset

The project will use the TED talks dataset, which contains 667 talks with 19752 comments from active users.

Task

Develop an unsupervised, language-independent approach to cluster sentences into literal and figurative. Sample a test set of lectures, and manually annotate the user comments for each lecture with one of three categories: positive, negative, or neutral. Measure the correlation between literal/figurative usages and positive/negative comments.

Algorithm

The algorithm will first learn an initial set of selectional restrictions. Then, it will infer literal and figurative word usages by checking whether a word combination breaks the selectional restrictions. Finally, a bootstrapping algorithm will iteratively refine the selectional restriction rules and the literal/figurative split.

Our main assumption is that in a large corpus the literal usage of a word is more frequent than its non-literal usage. Therefore, in Subject-Verb and Verb-Object pairs (we will look only at these constructions), the most frequent combinations of subject/object WordNet (WN) domains and verb roles from VerbNet (VN) do not break selectional restrictions. We will construct the initial split into combinations that violate or do not violate selectional restrictions using GMM clustering of tuples. Each tuple will contain collocation measures of WN domain and VN role combinations for the candidate pairs. We assume that legal domain combinations have small variance, whereas irregular domain combinations consist of outliers and have large variance. Given this initial set of domain combinations that do not violate selectional restrictions, we can detect literal and figurative usages of words.
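The initial split can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the project's implementation: each WN domain / VN role combination is reduced to a single collocation score (the proposal uses tuples of several measures), the mixture has exactly two components, and the component with the smaller variance is taken to be the non-violating one. The function names, the domain and role labels, and the scores in example_scores are all hypothetical.

```python
import math


def _posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component for x (log-space for stability)."""
    logs = [
        math.log(w) - 0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_two_component_gmm(xs, iterations=100, min_var=1e-3):
    """Fit a two-component 1D Gaussian mixture with EM; returns (weights, means, variances)."""
    mean_all = sum(xs) / len(xs)
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / len(xs), min_var)
    # Assumption: initialise the two means at the extremes of the data.
    weights, means, variances = [0.5, 0.5], [min(xs), max(xs)], [var_all, var_all]
    for _ in range(iterations):
        # E-step: soft assignment of every score to both components.
        resp = [_posteriors(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(spread, min_var)  # floor avoids a collapsed component
    return weights, means, variances


def split_combinations(scores):
    """Map each (domain, role) key to True if it falls in the small-variance component."""
    xs = list(scores.values())
    weights, means, variances = fit_two_component_gmm(xs)
    regular = 0 if variances[0] <= variances[1] else 1
    return {
        key: _posteriors(x, weights, means, variances)[regular] > 0.5
        for key, x in scores.items()
    }


# Hypothetical collocation scores for (WN domain, VN role) combinations.
example_scores = {
    ("noun.person", "Agent"): 2.0,
    ("noun.animal", "Agent"): 2.1,
    ("noun.group", "Agent"): 1.9,
    ("noun.food", "Patient"): 2.05,
    ("noun.artifact", "Patient"): 1.95,
    ("noun.artifact", "Theme"): 2.0,
    ("noun.substance", "Patient"): 2.1,
    ("noun.communication", "Theme"): 1.9,
    ("noun.cognition", "Patient"): -4.0,
    ("noun.time", "Patient"): -2.5,
    ("noun.feeling", "Agent"): -1.0,
    ("noun.attribute", "Agent"): -3.2,
}

labels = split_combinations(example_scores)
```

In this sketch a value of True in labels marks a combination treated as respecting the selectional restrictions; a Subject-Verb or Verb-Object pair whose combination is marked False would be a candidate figurative usage. The variance floor and the choice of initial means are design choices of the sketch only.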

Evaluation

We will compare our metaphor detection results to Birke and Sarkar's (2006) results on a subset of twenty-five verbs that appear in 1,965 sentences, manually labeled L (literal) or N (nonliteral) according to the sense of the target verb. The labeled verbs are publicly available in the TroFi (Trope Finder) Example Base. The baseline F-score obtained by Birke and Sarkar's (2006) active learning approach is 64.9%.
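For reference, a minimal sketch of an F-score computation over L/N labels follows. It assumes the standard F1 definition (harmonic mean of precision and recall) computed for a single target class; Birke and Sarkar's exact averaging scheme may differ, and the label lists here are made up for illustration.

```python
def f_score(gold, predicted, target="N"):
    """F1 for one class, given parallel lists of gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five sentences containing a target verb.
gold = ["L", "N", "N", "L", "N"]
predicted = ["L", "N", "L", "N", "N"]
nonliteral_f1 = f_score(gold, predicted, target="N")
```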

To evaluate the assumption that there is a correlation between the amount of figurative usage and the polarity of user comments, we will rank TED talks by the amount of figurative language in each talk, and will manually annotate positive and negative comments for the talks that contain the largest or the smallest amounts of figurative language. We will also randomly sample N talks and manually evaluate them in the same way.
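The ranking and correlation step could look like the sketch below. The proposal does not name a correlation statistic; Spearman's rank correlation is used here as an assumption because the evaluation is based on ranking talks. The helper names and the per-talk numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant (a convention)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def select_extremes(figurative_share, k):
    """Return the k talks with the least and the k with the most figurative language."""
    ranked = sorted(figurative_share, key=figurative_share.get)
    return ranked[:k], ranked[-k:]


# Hypothetical per-talk values: share of figurative sentences, share of positive comments.
figurative_share = {"talk_a": 0.05, "talk_b": 0.20, "talk_c": 0.12, "talk_d": 0.31}
positive_share = {"talk_a": 0.40, "talk_b": 0.65, "talk_c": 0.50, "talk_d": 0.80}

talks = list(figurative_share)
rho = spearman([figurative_share[t] for t in talks], [positive_share[t] for t in talks])
least, most = select_extremes(figurative_share, k=1)
```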

Related work

C. Sporleder and L. Li. 2009. Unsupervised Recognition of Literal and Non-Literal Use of Idiomatic Expressions. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pages 754–762. Association for Computational Linguistics.