Tsur et al ICWSM 10

From Cohen Courses
Revision as of 18:28, 30 September 2012 by Epapalex (talk | contribs)

This is a paper that appeared at the International AAAI Conference on Weblogs and Social Media 2010.

Citation

 title={ICWSM--A great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews},
 author={Tsur, O. and Davidov, D. and Rappoport, A.},
 booktitle={Proceedings of the fourth international AAAI conference on weblogs and social media},
 pages={162--169},
 year={2010}

Online version

ICWSM–A great catchy name: Semi-supervised recognition of sarcastic sentences in online product reviews

Summary

In this work, the authors introduce a novel semi-supervised approach for identifying sarcasm in the comments of online product reviews. To do so, they first define a small, hand-labeled training set that contains some very obviously sarcastic comments and some clearly non-sarcastic ones. Each of them is assigned a sarcasm level on a scale from 1 to 5. Using this training set, they extract two different types of features:

  • Pattern-based: To identify patterns, the authors classify every term as either a High Frequency Word (HFW) or a Content Word (CW), simply by thresholding its corpus frequency (HFWs have a higher corpus frequency than CWs). Each pattern may then contain 2-6 HFWs and 1-6 CWs. Next, to cut down the initially large number of patterns, they filter out those that are not particularly useful, eliminating patterns that 1) appear only in reviews of a single product, or 2) appear in the training set both in clearly sarcastic examples (rated 5) and in clearly non-sarcastic ones (rated 1).
  • Syntactic
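
The pattern-based step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the single relative-frequency threshold `hfw_threshold`, and the `pattern_hits` data layout are all assumptions made for the example, and the second filter is read as "the pattern occurs in both rating-5 and rating-1 training examples".

```python
from collections import Counter


def classify_words(tokens, hfw_threshold):
    """Map each distinct token to 'HFW' or 'CW' by relative corpus frequency.

    hfw_threshold is a hypothetical cut-off (fraction of all tokens);
    the summary above does not state the value actually used.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return {
        word: "HFW" if count / total > hfw_threshold else "CW"
        for word, count in counts.items()
    }


def is_valid_pattern(pattern_words, word_class):
    """Return True if the pattern has 2-6 HFWs and 1-6 CWs.

    Words missing from word_class are counted as CWs (an assumption).
    """
    n_hfw = sum(1 for w in pattern_words if word_class.get(w) == "HFW")
    n_cw = len(pattern_words) - n_hfw
    return 2 <= n_hfw <= 6 and 1 <= n_cw <= 6


def filter_patterns(pattern_hits):
    """Keep only the patterns that survive both filters.

    pattern_hits maps a pattern to a list of (product_id, label) pairs,
    one per training example in which the pattern occurs (assumed layout).
    """
    kept = []
    for pattern, hits in pattern_hits.items():
        products = {product for product, _ in hits}
        labels = {label for _, label in hits}
        if len(products) < 2:
            continue  # filter 1: seen for a single product only
        if 1 in labels and 5 in labels:
            continue  # filter 2 (assumed reading): seen at both extremes
        kept.append(pattern)
    return kept
```

For instance, with a toy corpus one could call `classify_words(corpus_tokens, 0.1)` and then test candidate word sequences with `is_valid_pattern` before passing the surviving patterns and their training-set occurrences to `filter_patterns`.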

After feature extraction, to decide how sarcastic a new comment drawn from a test dataset is, they use a k-NN-inspired classifier, which works as follows:
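
As a rough illustration of the general k-NN idea, and not the authors' exact scoring rule, the sketch below assumes that each comment has already been turned into a numeric feature vector, that distance is Euclidean, and that the predicted score is the plain average of the neighbors' 1-5 labels. The function name `knn_sarcasm_score` and the default `k=5` are hypothetical.

```python
import math


def knn_sarcasm_score(query, training, k=5):
    """Average the labels of the k training vectors closest to `query`.

    training is a list of (feature_vector, label) pairs with labels in 1-5.
    Unweighted averaging and Euclidean distance are assumptions made for
    this sketch; the original method may weight neighbors differently.
    """
    if not training:
        raise ValueError("training set is empty")
    # Sort training examples by Euclidean distance to the query vector.
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Toy usage with made-up 2-dimensional feature vectors.
toy_training = [
    ((0.0, 0.0), 1),
    ((0.0, 1.0), 1),
    ((5.0, 5.0), 5),
    ((5.0, 6.0), 5),
    ((6.0, 5.0), 4),
]
score = knn_sarcasm_score((5.0, 5.0), toy_training, k=3)
```

A score near 5 would then indicate a comment whose nearest labeled neighbors are clearly sarcastic, and a score near 1 one whose neighbors are clearly non-sarcastic.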


Evaluation

Datasets:

Metrics:

Baselines:

Results:


Related Papers

Study Plan