Difference between revisions of "Hassan et al, ACL 2010"

From Cohen Courses
(Replaced content with '== Citation == Ahmed Hassan, Dragomir R. Radev, Junghoo Cho, Amruta Joshi. 2009. Content Based Recommendation and Summarization in the Blogosphere. The International Confere…')
== Citation ==
 
Ahmed Hassan, Dragomir R. Radev, Junghoo Cho, Amruta Joshi. 2009. Content Based Recommendation and Summarization in the Blogosphere. The International Conference on Weblogs and Social Media (ICWSM 2009).
 
 
== Online version ==
 
 
 
[http://www-personal.umich.edu/~hassanam/my_publications/icwsm09.pdf ICWSM09]
 
 
 
== Summary ==
 
 
 
The aim of this [[Category::paper]] is to find important and influential blogs with a recurring interest in a specific topic. Given a set of blogs related to a particular topic, the authors seek a subset of blogs that represents the larger set.
 
 
 
The authors approach this [[AddressesProblem::blog retrieval]] problem using [[UsesMethod::vector space models]].
 
 
 
Based on their observations, the authors claim that the emotion and word patterns of tweets change when public opinion changes. With this in mind, the authors developed an [[UsesDataset::Emotion Corpus (Upinion)]] to detect emotions in tweets.
 
 
Two methods are used to detect opinions:
 
 
 
* Vector Space Model : A binary vector is created for each tweet. Each class of the Emotion Corpus is represented as a dimension of the vector, and the value of each dimension is determined by whether the tweet contains any emotion word from the corresponding class. The centroid of the tweet vectors is calculated to represent an interval. [[UsesMethod::Cosine similarity]] is applied to the centroid vectors to find the opinion similarity between two intervals.
 
 
 
* Set Space Model : Each time interval is represented by a single document which is the union of tweets posted in that particular time interval. [[UsesMethod::Jaccard similarity]] is used to find the similarity between two intervals.
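
The two similarity measures above can be illustrated with a minimal Python sketch. The emotion classes, the word lists in <code>EMOTION_LEXICON</code>, the whitespace tokenizer, and all function names are hypothetical stand-ins chosen for illustration; they are not taken from the Emotion Corpus or from the authors' implementation.

```python
import math

# Hypothetical toy lexicon (emotion class -> emotion words); NOT the real Emotion Corpus.
EMOTION_LEXICON = {
    "joy": {"happy", "glad", "great"},
    "anger": {"angry", "furious", "hate"},
    "sadness": {"sad", "sorry", "cry"},
}
CLASSES = sorted(EMOTION_LEXICON)  # fixed dimension order: anger, joy, sadness


def tokenize(tweet):
    """Lowercased whitespace tokenization (a simplifying assumption)."""
    return set(tweet.lower().split())


def emotion_vector(tweet):
    """Binary vector: 1.0 if the tweet contains any word of the class, else 0.0."""
    words = tokenize(tweet)
    return [1.0 if words & EMOTION_LEXICON[c] else 0.0 for c in CLASSES]


def centroid(vectors):
    """Component-wise mean of the tweet vectors in one interval."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(interval_a, interval_b):
    """Set space model: each interval is the union of the words of its tweets."""
    a = set().union(*(tokenize(t) for t in interval_a))
    b = set().union(*(tokenize(t) for t in interval_b))
    return len(a & b) / len(a | b) if a | b else 0.0


# Two toy intervals of tweets (made-up data).
before = ["so happy for him", "great win today"]
after = ["i am angry", "so sad and sorry"]

vsm_similarity = cosine(
    centroid([emotion_vector(t) for t in before]),
    centroid([emotion_vector(t) for t in after]),
)
set_similarity = jaccard(before, after)
```

A low value from either measure between two adjacent intervals would suggest a shift in opinion; how the two scores are thresholded and combined is not specified here, so the sketch leaves that step out.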
 
 
 
The authors combine these two methods to detect a change and report a breakpoint.
 
 
 
In addition to detecting these changes, the authors also propose a tf-idf based scoring method to represent the breakpoints. They find the keywords by looking at the tf-idf of the words while making sure that a word from the current time period does not increase the prominence of the same word in an older time period.
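
One possible reading of this scoring scheme is sketched below in Python. It treats each time interval as a document and computes document frequencies only over the intervals up to and including the one being scored, so later tweets cannot alter the keywords of an earlier period. This restriction, the smoothed idf formula, and the function name <code>breakpoint_keywords</code> are assumptions made for illustration, not the authors' confirmed formula.

```python
import math
from collections import Counter


def breakpoint_keywords(intervals, t, top_k=3):
    """Rank the words of interval t by a tf-idf score.

    `intervals` is a list of token lists, one per time interval (a hypothetical
    input format). Document frequencies are counted over intervals 0..t only,
    which is an assumed way of keeping newer intervals from influencing the
    scores of older ones.
    """
    history = intervals[: t + 1]
    df = Counter()
    for tokens in history:
        df.update(set(tokens))  # count each word once per interval

    tf = Counter(intervals[t])
    total = len(intervals[t])
    n = len(history)
    # Smoothed idf, log((1 + n) / df), so scores stay positive for the first interval.
    scores = {w: (c / total) * math.log((1 + n) / df[w]) for w, c in tf.items()}
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Made-up token lists for two intervals.
intervals = [
    ["golf", "tiger", "win"],
    ["tiger", "accident", "car", "accident"],
]
keywords = breakpoint_keywords(intervals, t=1, top_k=2)
```

Because only intervals up to <code>t</code> enter the computation, appending further intervals to the list leaves the keywords of interval <code>t</code> unchanged.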
 
 
 
The authors report an analysis of the topic of Tiger Woods's car accident in 2009. They found several possible breaks within the tweets, some of which are related to events reported in the news. They were also able to produce prominent words that describe each breakpoint.
 
 
 
Related to the paper, the authors produce a [http://upinion.cse.buffalo.edu/beta/index.php news tracking application] on Twitter where a user can click on a period to see the events of that period with related prominent words.
 
 
 
A related work, [[RelatedPaper::Ku et al, AAAI 2006]], also focused on identifying temporal changes in opinion by using language characteristics of Chinese.
 

Revision as of 23:57, 30 March 2011
