Yue Lu, WWW 2010

From Cohen Courses

Revision as of 09:04, 2 October 2012

Citation

 author = {Lu, Yue and Tsaparas, Panayiotis and Ntoulas, Alexandros and Polanyi, Livia},
 title = {Exploiting social context for review quality prediction},
 booktitle = {Proceedings of the 19th international conference on World wide web},
 series = {WWW '10},
 year = {2010},
 isbn = {978-1-60558-799-8},
 pages = {691--700},
 numpages = {10},

Online version

http://sifaka.cs.uiuc.edu/~yuelu2/pub/www10-reviewQuality.pdf

Abstract from the paper

Online reviews in which users publish detailed commentary about their experiences and opinions with products, services, or events are extremely valuable to users who rely on them to make informed decisions. However, reviews vary greatly in quality and are constantly increasing in number, therefore, automatic assessment of review helpfulness is of growing importance. Previous work has addressed the problem by treating a review as a stand-alone document, extracting features from the review text, and learning a function based on these features for predicting the review quality. In this work, we exploit contextual information about authors’ identities and social networks for improving review quality prediction. We propose a generic framework for incorporating social context information by adding regularization constraints to the text-based predictor. Our approach can effectively use the social context information available for large quantities of unlabeled reviews. It also has the advantage that the resulting predictor is usable even when social context is unavailable. We validate our framework within a real commerce portal and experimentally demonstrate that using social context information can help improve the accuracy of review quality prediction especially when the available training data is sparse.

Summary

Automatic review quality prediction helps filter out spam and bogus reviews on sites like Yelp.com or Amazon.com. Most automatic review quality predictors rely on the review text alone. In this paper, the authors describe a method for incorporating social context information into a text-based review quality predictor.

First, the authors describe their text-based quality prediction system, which uses the following types of features extracted from the text:

  • Text-statistics features, which are based on aggregate statistics over the text, such as the length of the review, the average sentence length, or the richness of the vocabulary.
  • Syntactic features, which capture statistics on the part-of-speech (POS) tags of the words in the text, such as the percentage of nouns, adjectives, punctuation marks, etc.
  • Conformity features, which measure how much a review conforms to the average by computing the KL-divergence between the unigram language model of the review and the unigram model of an “average” review that contains the text of all reviews for the same item.
  • Sentiment features, which take into account the presence of positive and negative sentiment words in the review.
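
The text-statistics and conformity features can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`tokenize`, `text_statistics`, `unigram_model`, `conformity`), the regex tokenizer, the add-alpha smoothing, and the direction of the divergence, KL(review || average), are all assumptions made for illustration, since the summary does not specify them.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphanumeric runs (an assumption).
    return re.findall(r"[a-z0-9']+", text.lower())


def text_statistics(review):
    # Aggregate statistics over the text: length, average sentence
    # length, and vocabulary richness (distinct tokens / all tokens).
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    tokens = tokenize(review)
    return {
        "num_tokens": len(tokens),
        "num_sentences": len(sentences),
        "avg_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        "vocab_richness": len(set(tokens)) / len(tokens) if tokens else 0.0,
    }


def unigram_model(tokens, vocab, alpha=1.0):
    # Add-alpha smoothed unigram distribution over a shared vocabulary.
    # The smoothing scheme is an assumption; it keeps every probability
    # positive so the KL-divergence stays finite.
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def conformity(review, item_reviews, alpha=1.0):
    # KL(review || average), where the "average" review is the
    # concatenated text of all reviews for the same item.
    # Lower values mean the review is closer to the average.
    review_tokens = tokenize(review)
    average_tokens = [t for r in item_reviews for t in tokenize(r)]
    vocab = set(review_tokens) | set(average_tokens)
    if not vocab:
        return 0.0
    p = unigram_model(review_tokens, vocab, alpha)
    q = unigram_model(average_tokens, vocab, alpha)
    return sum(p[w] * math.log(p[w] / q[w]) for w in vocab)
```

Under these assumptions, a review that is identical to the pooled text of its item gets a divergence of zero, and the value grows as the review's word distribution departs from that of the other reviews. In a full feature vector these numbers would sit alongside the syntactic and sentiment features, which need a POS tagger and a sentiment lexicon and are therefore left out of this sketch.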

Related Papers

Study Plan