Difference between revisions of "Comparison Andreevskaia et al ICWSM 2007 and MHurst KNigam RetrievingTopicalSentimentsFromOnlineDocumentColeections"

Revision as of 09:16, 6 November 2012

Papers

All Blogs are Not Made Equal: Exploring Genre Differences in Sentiment Tagging of Blogs, Alina Andreevskaia, Sabine Bergler, and Monica Urseanu, ICWSM 2007
Hurst, Matthew F., and Kamal Nigam. "Retrieving topical sentiments from online document collections." Proceedings of SPIE. Vol. 5296. 2004.

Problem

Andreevskaia_2007 perform binary and ternary sentiment classification on a per-sentence basis. Using manually annotated data, they study the differences between "personal diary" and "journalistic" style web blogs. They evaluate two systems: one based on sentiment word counts, and an improved version that also uses valence shifters.
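To make the two systems concrete, the sketch below tags a sentence by counting sentiment words and optionally lets a nearby negator flip a word's polarity. It is only an illustration under stated assumptions: the tiny lexicon, the negator list, the two-token window and the function name tag_sentence are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical toy lexicon; the word list used in the paper is far larger.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "awful", "sad", "hate"}
# Hypothetical valence shifters: negators that flip a following sentiment word.
NEGATORS = {"not", "never", "no", "hardly"}


def tag_sentence(sentence, use_shifters=True, window=2):
    """Return 'positive', 'negative' or 'neutral' from sentiment word counts."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity == 0:
            continue
        if use_shifters:
            # Flip the polarity if a negator occurs in the preceding window.
            preceding = tokens[max(0, i - window):i]
            if any(word in NEGATORS for word in preceding):
                polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With use_shifters=False the function behaves like a plain word-count tagger, so the two settings can be compared on the same sentence, for example "I do not love this phone".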

Hurst_Nigam_2004 had previously performed a similar task: identifying polarity on a per-sentence basis in order to discover polar sentences about a topic. Hurst and Nigam used a linear classifier ([Winnow_Algorithm]) for topic classification and a rule-based grammatical model for polarity identification.
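For readers unfamiliar with Winnow, the following is a minimal sketch of a Winnow-style topic classifier over binary word-presence features. It is not Hurst and Nigam's implementation: the class and function names, the promotion factor alpha = 2, the threshold equal to the number of features, and the choice to demote by division (often called Winnow2) are common textbook settings assumed here for illustration.

```python
class Winnow:
    """Winnow-style linear classifier over binary (present/absent) features."""

    def __init__(self, n_features, alpha=2.0):
        self.weights = [1.0] * n_features   # all weights start at 1
        self.alpha = alpha                  # assumed promotion/demotion factor
        self.threshold = float(n_features)  # assumed threshold: feature count

    def predict(self, active):
        """active: set of indices of the features present in the sentence."""
        return sum(self.weights[i] for i in active) >= self.threshold

    def update(self, active, on_topic):
        """Mistake-driven multiplicative update on the active features only."""
        if self.predict(active) == on_topic:
            return
        # Missed an on-topic sentence: promote. False alarm: demote.
        factor = self.alpha if on_topic else 1.0 / self.alpha
        for i in active:
            self.weights[i] *= factor


def train(model, examples, epochs=5):
    """Run several passes over (active_features, on_topic) pairs."""
    for _ in range(epochs):
        for active, on_topic in examples:
            model.update(active, on_topic)
    return model
```

Because updates are multiplicative and touch only the features present in a misclassified sentence, weights of words that never co-occur with the topic stay at their initial value while topical words grow quickly.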

Big Idea

Both papers perform sentiment or polarity classification on a per-sentence basis rather than at the document or message level. This can be beneficial for fine-grained identification of sentiments pertaining to a specific entity or topic. Both approaches are largely rule-based and rely on sentiment word lists to identify sentiment. While Hurst_et_al use a restricted sentiment word list pertaining to a single topic, Andreevskaia_et_al use a much bigger HM word list, further expanded using WordNet. Similarly, where Hurst_et_al use a grammatical approach to assign polarity to topics, Andreevskaia_et_al restrict themselves to sentiment word counts and assign sentiment labels only.

Method

Dataset Used

Andreevskaia_et_al tested their system on two datasets:

For journalistic style blogs they used the cyberjournalist.net dataset.
For personal journal style blogs they used the 20060501.xml dataset.

Each dataset contained 600 manually annotated sentences: 200 positive, 200 negative and 200 neutral.

Hurst_Nigam_et_al used a dataset containing 16,616 sentences from 982 messages about certain domains, extracted from online resources (Usenet, online message boards, etc.). They manually annotated 250 randomly selected sentences with the following labels:

Polarity Identification: positive, negative
Topic Identification: Topical, Out-of-Topic
Polarity and Topic Identification: positive-correlated, negative-correlated, positive-uncorrelated, negative-uncorrelated.
