Miller et al ICWSM 2011

From Cohen Courses
Revision as of 05:45, 27 September 2012 by Anikag (talk | contribs) (→‎Dataset)

Citation

author    = {Mahalia Miller and
              Conal Sathi and
              Daniel Wiesenthal and
              Jure Leskovec and
              Christopher Potts},
 title     = {Sentiment Flow Through Hyperlink Networks},
 booktitle = {ICWSM},
 year      = {2011},
 ee        = {http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2883},
 crossref  = {DBLP:conf/icwsm/2011},
 bibsource = {DBLP, http://dblp.uni-trier.de}

Online Version

http://cs.stanford.edu/people/jure/pubs/sentiflow-icwsm11.pdf

Main Idea

This paper combines sentiment analysis of text with graph analysis to study how sentiment flows through a network of blog posts connected by hyperlinks.

Dataset

The data were obtained from the MemeTracker project for August 2010. The dataset contains roughly 1 million blog posts per day. Each post consists of a URL, a time stamp, the full text of the post, and the list of URLs of the posts it cites. The data were pruned to remove singleton posts (posts that do not link to any other post). Self-links and links to posts outside the dataset were also removed in order to focus on the flow of sentiment within the network. The dataset used has approximately 8 million blog posts and 15 million hyperlink edges.
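The pruning steps described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `prune_posts`, the dict-of-lists input format, and the order of the steps (singletons first, then self-links and outside links) are hypothetical choices that follow the wording of this summary, not code or details taken from the paper.

```python
def prune_posts(posts):
    """Prune a hyperlink network of blog posts.

    `posts` maps each post URL to the list of URLs it cites
    (a hypothetical input format chosen for this sketch).
    """
    # Step 1: drop singleton posts, i.e. posts that cite nothing at all.
    linked = {url: cited for url, cited in posts.items() if cited}

    # Step 2: drop self-links and links to posts outside the remaining data.
    pruned = {}
    for url, cited in linked.items():
        pruned[url] = {c for c in cited if c != url and c in linked}
    return pruned


example = {
    "a": ["b", "a", "x"],  # "a" is a self-link, "x" is outside the data
    "b": ["a"],
    "c": [],               # singleton: cites no other post
}
result = prune_posts(example)
```

In this toy input, post `c` is dropped as a singleton, and post `a` keeps only its link to `b`.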

Methodology

- Sentiment Extraction: Each document is treated as a bag of words. The Harvard Inquirer and SentiWordNet are used to obtain sentiment scores for the individual words in a post. The sentiment attributes produced by the analysis are the positivity, negativity, and objectivity of a post. The paper also proposes extracting sentiment from emoticons. The authors define the average sentiment of a user as the baseline.
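A bag-of-words, lexicon-based scorer of the kind described above might look like the following sketch. The lexicon `TOY_LEXICON` and its scores are invented for illustration and are not entries from the Harvard Inquirer or SentiWordNet. Averaging word scores over the matched tokens is an assumption, since this summary does not say how the paper aggregates them. Objectivity is computed as one minus positivity minus negativity, following the SentiWordNet convention.

```python
import re
from collections import Counter

# Hypothetical lexicon: word -> (positivity, negativity). Illustrative values only.
TOY_LEXICON = {
    "good": (0.5, 0.0),
    "bad": (0.0, 0.5),
    "table": (0.0, 0.0),
}


def score_post(text, lexicon=TOY_LEXICON):
    """Return positivity, negativity, and objectivity for one post."""
    # Bag of words: word order is discarded, only counts are kept.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))

    pos = neg = 0.0
    matched = 0
    for word, count in counts.items():
        if word in lexicon:
            word_pos, word_neg = lexicon[word]
            pos += word_pos * count
            neg += word_neg * count
            matched += count

    if matched == 0:
        # No lexicon words found: treat the post as fully objective (assumption).
        return {"positivity": 0.0, "negativity": 0.0, "objectivity": 1.0}

    pos /= matched
    neg /= matched
    return {"positivity": pos, "negativity": neg, "objectivity": 1.0 - pos - neg}


scores = score_post("Good good bad table")
```

A per-user baseline, as mentioned above, could then be taken as the mean of these scores over all of a user's posts.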




Study Plan

- Harvard Inquirer