Miller et al ICWSM 2011
Revision as of 04:10, 27 September 2012
Citation
author = {Mahalia Miller and Conal Sathi and Daniel Wiesenthal and Jure Leskovec and Christopher Potts},
title = {Sentiment Flow Through Hyperlink Networks},
booktitle = {ICWSM},
year = {2011},
ee = {http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/view/2883},
crossref = {DBLP:conf/icwsm/2011},
bibsource = {DBLP, http://dblp.uni-trier.de}
Online Version
http://cs.stanford.edu/people/jure/pubs/sentiflow-icwsm11.pdf
Main Idea
This paper combines sentiment analysis of text with graph analysis to study how sentiment flows through a network of blog posts connected by hyperlinks.
Dataset
The data was obtained from the MemeTracker Project for the month of August 2010. The raw dataset consists of roughly 1 million blog posts per day. Each post consists of a URL, a time stamp, the full text of the post, and the list of URLs of the posts it cites. The data has been pruned to remove singleton posts (posts which do not link to any other posts). Links from a post to itself and links to posts outside the dataset have been removed in order to focus on the flow of sentiment within the network. The resulting dataset has approximately 8 million blog posts and 15 million hyperlink edges.
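The pruning described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `prune_posts` and `count_edges`, the dictionary representation (post URL mapped to the list of URLs it cites), and the order of the two steps are assumptions made for illustration, and the singleton rule follows the wording on this page literally, so the paper's exact procedure may differ.

```python
def prune_posts(posts):
    """Prune a hyperlink network of posts (hypothetical helper).

    posts: dict mapping a post URL to the list of URLs it cites.
    Returns a new dict with singleton posts, self-links, and links
    to posts outside the remaining data removed.
    """
    # Step 1 (assumed order): drop singleton posts, read literally as
    # posts that cite no URL other than their own.
    kept = {
        url: cited
        for url, cited in posts.items()
        if any(c != url for c in cited)
    }
    # Step 2: drop self-links and links that point outside the kept posts.
    return {
        url: [c for c in cited if c != url and c in kept]
        for url, cited in kept.items()
    }


def count_edges(posts):
    """Count the hyperlink edges that remain after pruning."""
    return sum(len(cited) for cited in posts.values())


# Toy example with made-up post identifiers.
toy = {
    "a": ["b", "a", "x"],  # cites b, itself, and an external URL x
    "b": ["a"],
    "c": [],               # singleton: cites nothing
    "d": ["d"],            # singleton: cites only itself
}
pruned = prune_posts(toy)
```

In this toy network only posts `a` and `b` survive, joined by two edges. Note that under this literal reading a post can end up with an empty citation list when everything it cited was itself removed; a real pipeline would need to decide whether to repeat the pruning in that case.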