Difference between revisions of "Comparison Andreevskaia et al ICWSM 2007 and MHurst KNigam RetrievingTopicalSentimentsFromOnlineDocumentColeections"

From Cohen Courses
Jump to navigationJump to search
Line 10: Line 10:
  
 
== Big Idea ==
 
== Big Idea ==
 +
 +
Both the papers try to perform sentiment or polarity classification on a per sentence basis rather than at a document or message level. This is sometimes beneficial for a fine grained identifying of sentiments pertaining to a specific entity or topic. Both the approaches use a more rule based approach by using sentiment word lists for identifying sentiments. While Hurst_et_al use a restricted sentiment word list pertaining to a single, Andreevskaia use a much bigger [[UsesDataset::HM word list]] further expanded using WordNet. Similarly where Hurst_et_al a grammatical approach to assign polarity to topics, Andreevskaia_et_al restricts to sentiment word counts for assigning sentiment labels only.
  
 
== Method ==
 
== Method ==

Revision as of 08:05, 6 November 2012

Papers

  1. All Blogs are Not Made Equal: Exploring Genre Differences in Sentiment Tagging of Blogs, Alina Andreevskaia, Sabine Bergler, and Monica Urseanu, ICWSM 2007
  2. Hurst, Matthew F., and Kamal Nigam. "Retrieving topical sentiments from online document collections." Proceedings of SPIE. Vol. 5296. 2004.

Problem

Andreevskaia_2007 perform sentiment classification(binary and ternary) on a per sentence basis. For their analysis they study the differences between "personal diary" and "journalistic" styled web blogs using a manually annotated data. They evaluate their performance on two systems, a sentiment word counts based system and an improved version using valence shifters.

Hurst_Nigam_2004 had previously performed a similar task of identifying polarity on a per sentence basis to discover polar sentences about a topic. Hurst and Nigam had used a linear classifier ([Winnow_Algorithm]) for topic classification and a rule based grammatical model for polarity identification.

Big Idea

Both the papers try to perform sentiment or polarity classification on a per sentence basis rather than at a document or message level. This is sometimes beneficial for a fine grained identifying of sentiments pertaining to a specific entity or topic. Both the approaches use a more rule based approach by using sentiment word lists for identifying sentiments. While Hurst_et_al use a restricted sentiment word list pertaining to a single, Andreevskaia use a much bigger HM word list further expanded using WordNet. Similarly where Hurst_et_al a grammatical approach to assign polarity to topics, Andreevskaia_et_al restricts to sentiment word counts for assigning sentiment labels only.

Method

Dataset Used

Other Discussions

Other Questions