Ramnath Balasubramanyan et al. ICWSM 2012


Citation

R. Balasubramanyan, W. W. Cohen, D. Pierce, et al. Modeling Polarizing Topics: When Do Different Political Communities Respond Differently to the Same News? ICWSM 2012.

Summary

In this paper, the authors use machine learning techniques to study how members of different political communities respond emotionally to the same news story.

Detecting comment sentiment

They use two different approaches to measure comment sentiment.

1. Using SentiWordNet to automatically compute a positive score and a negative score from the words of the comments on a blog post. If the negative score is larger than the positive score, the sentiment is negative, and vice versa.

2. Using PMI. They first choose the 100 most positive and the 100 most negative words from SentiWordNet as seed lists. They then compute, separately for each blog, the PMI of each word in the vocabulary with the seed lists, and select the 1000 most positive and the 1000 most negative words. Comment sentiment is then judged by the counts of negative versus positive words.
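The two approaches above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the tiny `LEXICON` stands in for SentiWordNet scores, all function names are hypothetical, co-occurrence is counted at the document level, and ties are labeled "neutral"; the last two choices are assumptions that the summary does not specify. A word's polarity is taken as its PMI with the positive seeds minus its PMI with the negative seeds.

```python
import math
from collections import Counter

# Toy stand-in for SentiWordNet: word -> (positive score, negative score).
LEXICON = {
    "great": (0.8, 0.0), "good": (0.6, 0.0),
    "awful": (0.0, 0.9), "bad": (0.0, 0.7),
}

def _label(pos, neg):
    # Tie handling ("neutral") is an assumption made for this sketch.
    if pos == neg:
        return "neutral"
    return "positive" if pos > neg else "negative"

def lexicon_sentiment(tokens, lexicon=LEXICON):
    """Approach 1: sum positive and negative scores and compare the totals."""
    pos = sum(lexicon.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(lexicon.get(t, (0.0, 0.0))[1] for t in tokens)
    return _label(pos, neg)

def pmi_polarity(docs, pos_seeds, neg_seeds, smoothing=0.5):
    """Approach 2, step 1: PMI(word, pos seeds) - PMI(word, neg seeds).

    A word co-occurs with a seed list when both appear in the same document.
    """
    n = len(docs)
    word_df, with_pos, with_neg = Counter(), Counter(), Counter()
    n_pos = n_neg = 0
    for doc in docs:
        words = set(doc)
        has_pos = bool(words & pos_seeds)
        has_neg = bool(words & neg_seeds)
        n_pos += has_pos
        n_neg += has_neg
        for w in words:
            word_df[w] += 1
            with_pos[w] += has_pos
            with_neg[w] += has_neg
    scores = {}
    for w, df in word_df.items():
        pmi_pos = math.log((with_pos[w] + smoothing) * n / (df * (n_pos + smoothing)))
        pmi_neg = math.log((with_neg[w] + smoothing) * n / (df * (n_neg + smoothing)))
        scores[w] = pmi_pos - pmi_neg
    return scores

def expand_lexicon(scores, k):
    """Keep the k most positive and the k most negative words by polarity."""
    ranked = sorted(scores, key=scores.get)
    top = ranked[max(0, len(ranked) - k):]
    return set(top), set(ranked[:k])

def count_sentiment(tokens, pos_words, neg_words):
    """Approach 2, step 2: compare counts of positive and negative words."""
    pos = sum(t in pos_words for t in tokens)
    neg = sum(t in neg_words for t in tokens)
    return _label(pos, neg)
```

In the setting described above, `pos_seeds` and `neg_seeds` would each hold 100 SentiWordNet words and `k` would be 1000, with `pmi_polarity` run once per blog.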

In their experiments, PMI is more accurate than SentiWordNet when evaluated against manual labels.

Single-community comment sentiment prediction

They experiment with two models, SVM and sLDA, to predict the comment sentiment of a blog post within the same blog, and find that sLDA performs better.

Multi-community comment sentiment prediction

To measure comment sentiment toward the same topic from different communities, they first combine posts from different blogs. They then extract topics from the combined corpus.

They then use sLDA to train models for two communities, one liberal and one conservative, based on the topics obtained from the combined corpus.

Each community has two regression models: one to predict comment sentiment and one to predict comment volume. In all, there are four regression models, and their attributes are all drawn from the same shared set of topics. The only difference between the models is the coefficients assigned to those topics.
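One way to write down the four regression models, sketched in the usual sLDA notation rather than the paper's own: let $\bar{z}_d$ be the empirical topic proportions of post $d$ over the $K$ shared topics, let $c$ index the community, and let $r$ index the response type. The symbols $\eta$, $\sigma$, $\bar{z}_d$, $c$, and $r$ are assumptions introduced here for illustration.

```latex
y_d^{(c,r)} \sim \mathcal{N}\!\left( \sum_{k=1}^{K} \eta_k^{(c,r)} \, \bar{z}_{d,k},\; \sigma_{c,r}^{2} \right),
\qquad c \in \{\text{liberal}, \text{conservative}\},
\quad r \in \{\text{sentiment}, \text{volume}\}
```

Under this reading, the topic features $\bar{z}_d$ are common to all four models, and only the coefficient vectors $\eta^{(c,r)}$ differ.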

Discovery

By comparing the coefficients that the different regression models assign to the same attribute (a topic, in this case), we can identify which topics are the most controversial and which are not.
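The comparison can be sketched as a ranking of topics by the gap between the two communities' coefficients. The absolute difference used here is an assumed measure chosen for illustration, the function name is hypothetical, and the coefficient values are made up rather than taken from the paper.

```python
def rank_topics_by_polarization(coef_a, coef_b):
    """Sort shared topics from the largest to the smallest coefficient gap.

    coef_a and coef_b map topic labels to regression coefficients for two
    communities. The gap is the absolute difference (an assumed measure).
    """
    shared = coef_a.keys() & coef_b.keys()
    gaps = {t: abs(coef_a[t] - coef_b[t]) for t in shared}
    # Break ties on the topic label so the order is deterministic.
    return sorted(gaps.items(), key=lambda item: (-item[1], item[0]))

# Made-up sentiment coefficients, for illustration only.
liberal = {"topic_0": 1.2, "topic_1": -0.4, "topic_2": 0.3}
conservative = {"topic_0": -0.9, "topic_1": -0.5, "topic_2": 0.3}

ranking = rank_topics_by_polarization(liberal, conservative)
for topic, gap in ranking:
    print(f"{topic}: gap = {gap:.2f}")
```

Topics at the top of the ranking are the candidates for "polarizing" topics; those near the bottom draw similar responses from both communities.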

The authors inspect the results and find that the most contentious topics include energy and the environment, unions and women's rights, and Senate procedures. The least polarized topics include the economy and taxes, the mid-term elections, and so on.

The authors also find that the sentiment coefficients are correlated with the volume coefficients, which means that positive sentiment tends to attract more comments from the blog's readers.
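A check of this kind can be sketched with a Pearson correlation between the two coefficient vectors of one community. Pearson's r is an assumption here, since the summary does not say which correlation measure was used, and the coefficient values below are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)

# Made-up per-topic coefficients for one community, for illustration only.
sentiment_coefs = [1.2, -0.4, 0.3, 0.8]
volume_coefs = [0.9, -0.1, 0.2, 0.5]

r = pearson(sentiment_coefs, volume_coefs)
print(f"correlation between sentiment and volume coefficients: {r:.3f}")
```

A value of `r` well above zero would match the pattern described above, where topics with more positive sentiment coefficients also have larger volume coefficients.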