Danescu-Niculescu-Mizil et al., WWW 2009

From Cohen Courses

Citation

 author = {Danescu-Niculescu-Mizil, Cristian and Kossinets, Gueorgi and Kleinberg, Jon and Lee, Lillian},
 title = {How opinions are received by online communities: a case study on amazon.com helpfulness votes},
 booktitle = {Proceedings of the 18th international conference on World wide web},
 series = {WWW '09},
 year = {2009},
 isbn = {978-1-60558-487-4},
 pages = {141--150},
 numpages = {10},

Online version

http://www.cs.cornell.edu/home/kleinber/www09-helpfulness.pdf

Abstract from the paper

There are many on-line settings in which users publicly express opinions. A number of these offer mechanisms for other users to evaluate these opinions; a canonical example is Amazon.com, where reviews come with annotations like "26 of 32 people found the following review helpful." Opinion evaluation appears in many off-line settings as well, including market research and political campaigns. Reasoning about the evaluation of an opinion is fundamentally different from reasoning about the opinion itself: rather than asking, "What did Y think of X?", we are asking, "What did Z think of Y's opinion of X?" Here we develop a framework for analyzing and modeling opinion evaluation, using a large-scale collection of Amazon book reviews as a dataset. We find that the perceived helpfulness of a review depends not just on its content but also, in subtle ways, on how the expressed evaluation relates to other evaluations of the same product. As part of our approach, we develop novel methods that take advantage of the phenomenon of review "plagiarism" to control for the effects of text in opinion evaluation, and we provide a simple and natural mathematical model consistent with our findings. Our analysis also allows us to distinguish among the predictions of competing theories from sociology and social psychology, and to discover unexpected differences in the collective opinion-evaluation behavior of user populations from different countries.

Summary

In this paper, the authors try to understand and model how opinions are evaluated within online communities. For example, on Amazon.com, users not only write product reviews but also provide an indication of the helpfulness of those reviews ("26 of 32 people found the following review helpful"). Previous work has shown that helpfulness votes of reviews on Amazon.com are not necessarily strongly correlated with certain measures of review quality. Rather, various complex social feedback mechanisms tend to affect how Amazon users evaluate each other's reviews in practice. The authors explored four different classes of theories that explain how social effects influence a group's reaction to an opinion:

  • The conformity hypothesis: a review is evaluated as helpful when its star rating is closer to the consensus (or average) star rating for the product.
  • The individual-bias hypothesis: a user will rate a review more highly if it expresses an opinion that he or she agrees with. Notice that if a diverse range of users apply this rule, then the overall helpfulness evaluation would be hard to distinguish from one based on conformity.
  • The brilliant-but-cruel hypothesis: negative reviewers are perceived as more intelligent, competent and expert than positive reviewers.
  • The quality-only straw-man hypothesis: helpfulness is evaluated purely on the basis of the textual content of the review.

The authors try to investigate how data on star ratings and helpfulness votes can support or contradict these hypotheses, using a dataset consisting of over four million reviews of roughly 675,000 books on Amazon's US site as well as smaller but comparably sized corpora from Amazon's UK, Germany and Japan sites.

Deviation from the mean

The helpfulness ratio of a review is the fraction of evaluators who found it helpful. The product average for a review of a given product is the average star rating given by all reviews of the product. The authors found that the median helpfulness ratio of reviews decreases monotonically as a function of the absolute difference between their star rating and the product average. This finding seems to be consistent with the conformity hypothesis, but a closer look raises some issues. Looking at the signed difference, which is positive or negative depending on whether the star rating is above or below average, the authors found that the median helpfulness as a function of signed difference not only falls away on both sides of 0, but does so asymmetrically: slightly negative reviews are punished more strongly, with respect to helpfulness evaluation, than slightly positive reviews. This disproves the brilliant-but-cruel hypothesis, and it is also at odds with the conformity hypothesis in its pure form, because closeness to the average is not being equally rewarded: reviews slightly above the average are rewarded more than those slightly below it.
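The two quantities defined above can be made concrete with a small sketch. This is not the authors' code; the review tuples below are hypothetical toy data (the paper uses millions of Amazon book reviews), and the grouping simply splits reviews by the sign of their deviation from the product average:

```python
from statistics import mean, median

# Hypothetical toy data: (product_id, star_rating, helpful_votes, total_votes).
reviews = [
    ("book1", 5, 20, 25), ("book1", 4, 18, 20), ("book1", 2, 3, 10),
    ("book2", 3, 9, 12), ("book2", 3, 8, 10), ("book2", 1, 2, 9),
]

# Product average: mean star rating over all reviews of the product.
stars_by_product = {}
for pid, stars, _, _ in reviews:
    stars_by_product.setdefault(pid, []).append(stars)
product_avg = {pid: mean(s) for pid, s in stars_by_product.items()}

# Helpfulness ratio = helpful votes / total votes;
# signed difference = star rating - product average.
# Group ratios by whether the review sits above the product average.
ratios_by_sign = {}
for pid, stars, helpful, total in reviews:
    above = (stars - product_avg[pid]) > 0
    ratios_by_sign.setdefault(above, []).append(helpful / total)

for above in sorted(ratios_by_sign):
    label = "above avg" if above else "at/below avg"
    print(f"{label}: median helpfulness ratio = {median(ratios_by_sign[above]):.2f}")
```

A fuller analysis would bin by the signed difference itself rather than just its sign, and plot the median helpfulness ratio per bin to see the asymmetric fall-off described above.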

Variance and individual bias

To explain the above phenomena, the authors associated with each product the variance of the star ratings assigned to it by all its reviews. They then grouped products by variance, and performed the signed-difference analysis on sets of products having fixed levels of variance. They found that:

  • When the variance is very low, the reviews with the highest helpfulness ratios are those with the average star rating.
  • With moderate values of the variance, the reviews evaluated as most helpful are those that are slightly above the average star rating.
  • As the variance becomes large, reviews with star ratings both above and below the average are evaluated as more helpful than those that have the average star rating (with the positive reviews still deemed somewhat more helpful).
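The grouping procedure above can be sketched as follows. Again this is illustrative rather than the authors' code: the data are invented, and the variance threshold (1.0) and the deviation band for "near average" (±0.25 stars) are arbitrary choices for the toy example:

```python
from statistics import mean, pvariance

# Hypothetical toy data: (product_id, star_rating, helpfulness_ratio).
reviews = [
    ("calm", 4, 0.80), ("calm", 4, 0.85), ("calm", 5, 0.55), ("calm", 3, 0.50),
    ("contested", 5, 0.75), ("contested", 1, 0.65), ("contested", 3, 0.40),
    ("contested", 5, 0.70), ("contested", 1, 0.60),
]

# Per-product average and variance of star ratings.
stars_by_product = {}
for pid, stars, _ in reviews:
    stars_by_product.setdefault(pid, []).append(stars)
avg = {pid: mean(s) for pid, s in stars_by_product.items()}
var = {pid: pvariance(s) for pid, s in stars_by_product.items()}

# Split products into low- and high-variance groups, then compare the
# mean helpfulness ratio of reviews near, above, and below the product
# average within each group.
report = {}
for pid, stars, ratio in reviews:
    group = "high" if var[pid] > 1.0 else "low"
    dev = stars - avg[pid]
    pos = "above" if dev > 0.25 else ("below" if dev < -0.25 else "near")
    report.setdefault((group, pos), []).append(ratio)

for key in sorted(report):
    print(key, round(mean(report[key]), 2))
```

On this toy data the low-variance product rewards near-average reviews most, while the high-variance product rewards above- and below-average reviews over near-average ones, mirroring the pattern in the bullets above.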

Related Papers

Study Plan