Difference between revisions of "Comparison Andreevskaia et al ICWSM 2007 and MHurst KNigam RetrievingTopicalSentimentsFromOnlineDocumentColeections"

From Cohen Courses

Revision as of 09:23, 6 November 2012

Papers

  1. All Blogs are Not Made Equal: Exploring Genre Differences in Sentiment Tagging of Blogs, Alina Andreevskaia, Sabine Bergler, and Monica Urseanu, ICWSM 2007
  2. Hurst, Matthew F., and Kamal Nigam. "Retrieving topical sentiments from online document collections." Proceedings of SPIE. Vol. 5296. 2004.

Problem

Andreevskaia_2007 perform binary and ternary sentiment classification on a per-sentence basis. They study the differences between "personal diary" and "journalistic" web blogs using manually annotated data. They evaluate two systems: one based on sentiment word counts and an improved version that also handles valence shifters.

Hurst_Nigam_2004 had previously performed a similar task: identifying polarity on a per-sentence basis to discover polar sentences about a topic. They used a linear classifier ([Winnow_Algorithm]) for topic classification and a rule-based grammatical model for polarity identification.

Big Idea

Both papers perform sentiment or polarity classification on a per-sentence basis rather than at the document or message level. This is useful for fine-grained identification of sentiments pertaining to a specific entity or topic. Both approaches are largely rule-based and rely on sentiment word lists. While Hurst_et_al use a restricted sentiment word list pertaining to a single topic, Andreevskaia_et_al use the much bigger HM word list, further expanded using WordNet. Similarly, where Hurst_et_al use a grammatical approach to assign polarity to topics, Andreevskaia_et_al restrict themselves to sentiment word counts and assign sentiment labels only.

Method

As briefly mentioned above, both papers perform sentiment classification at the sentence level.

Hurst_Nigam perform experiments to find polar sentences regarding a topic. They first train a supervised linear classifier, Winnow_Algorithm, on messages. During testing they classify each message as topical or non-topical. If a message is found to be topical, each of its sentences is classified again using the same classifier. If a sentence is found to be topical, a rule-based polarity classification is run to associate sentiments with the topic. The sentiment analysis in their system uses a manually created sentiment word list (tuned to the domain) and a set of syntactic patterns and grammatical rules: predicative modification (it is good), attributive modification (a good car), equality (it is a good car), polar clause (it broke my car), and negation rules such as verbal attachment (it is not good, it isn't good).
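The two-stage topical filter described above can be sketched in a few lines. This is only an illustration under stated assumptions: the summary does not say which Winnow variant or parameters Hurst_Nigam used, so the promotion factor `alpha`, the threshold choice, the tokenizer and the helper name `topical_sentences` are all hypothetical.

```python
import re
from typing import Iterable, List, Sequence, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase a text and return its set of word tokens (binary features)."""
    return set(re.findall(r"[a-z']+", text.lower()))


class Winnow:
    """Minimal Winnow-style linear classifier with multiplicative updates."""

    def __init__(self, vocabulary: Iterable[str], alpha: float = 2.0):
        # Assumption: all weights start at 1 and the threshold equals the
        # vocabulary size, a common textbook setting.
        self.weights = {word: 1.0 for word in vocabulary}
        self.alpha = alpha
        self.threshold = float(len(self.weights))

    def predict(self, features: Set[str]) -> bool:
        score = sum(self.weights.get(f, 0.0) for f in features)
        return score > self.threshold

    def update(self, features: Set[str], label: bool) -> None:
        if self.predict(features) == label:
            return  # Winnow is mistake-driven: no change when correct.
        # Promote active features on a missed positive, demote otherwise.
        factor = self.alpha if label else 1.0 / self.alpha
        for f in features:
            if f in self.weights:
                self.weights[f] *= factor

    def fit(self, examples: Sequence[Tuple[Set[str], bool]], epochs: int = 10) -> None:
        for _ in range(epochs):
            for features, label in examples:
                self.update(features, label)


def topical_sentences(clf: Winnow, sentences: List[str]) -> List[str]:
    """Classify the whole message first, then each sentence, with one model."""
    message_features = set().union(*(tokenize(s) for s in sentences))
    if not clf.predict(message_features):
        return []
    return [s for s in sentences if clf.predict(tokenize(s))]
```

Only the sentences returned by `topical_sentences` would then be passed on to the rule-based polarity step.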

Andreevskaia_et_al perform experiments on binary [positive and negative] and ternary [positive, negative and neutral] sentiment classification over sentences extracted from web blogs in two genres: personal diaries and journalistic blogs. They develop two systems.

  1. System1: Using manual sentiment lists from the HM word list and General Inquirer as seed lists, they expand them automatically using WordNet. This is done simply by following synonymy, antonymy and hyponymy relations in WordNet. The words in the expanded list are then searched for in WordNet glosses, and the sentiment of their head word is assigned to them along with a score which is calculated from:
    1. Number of times the word was retrieved in multiple system runs with different non-intersecting seed lists
    2. Consistency of sentiment assigned in these multiple runs
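As a rough illustration of these two ingredients, the sketch below tallies how often a word was retrieved across runs and how consistently it was labelled. The exact scoring formula is not given in this summary, so the function name `aggregate_runs`, the consistency ratio and the signed vote count are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List

LABELS = ("positive", "negative")


def aggregate_runs(runs: List[Dict[str, str]]) -> Dict[str, Dict[str, float]]:
    """Combine several expansion runs into per-word statistics.

    Each run maps a retrieved word to the label it received in that run.
    """
    votes: Dict[str, Counter] = {}
    for run in runs:
        for word, label in run.items():
            if label not in LABELS:
                raise ValueError(f"unknown label: {label!r}")
            votes.setdefault(word, Counter())[label] += 1

    stats: Dict[str, Dict[str, float]] = {}
    for word, counts in votes.items():
        retrieved = counts["positive"] + counts["negative"]
        net = counts["positive"] - counts["negative"]
        stats[word] = {
            "retrieved": retrieved,                # ingredient 1: retrieval count
            "consistency": abs(net) / retrieved,   # ingredient 2: label agreement
            "signed_score": net,                   # hypothetical combined score
        }
    return stats
```

A word labelled positive in every run gets a consistency of 1.0, while a word that is labelled positive and negative equally often gets 0.0.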

The overall sentiment score of a sentence was calculated as the sum of the fuzzy scores [normalized to 0-1 using a sigmoid transformation] of its individual sentiment-bearing words. The scores of positive words were coded as values above zero and the scores of negative words as values below zero.
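A minimal sketch of this summation follows. The lexicon values are made up for illustration, and treating a sum of exactly zero as "neutral" is an assumption; the summary does not say how ties or the neutral class were decided.

```python
import re
from typing import Dict


def sentence_score(sentence: str, lexicon: Dict[str, float]) -> float:
    """Sum the signed scores of all sentiment-bearing words in a sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(sentence: str, lexicon: Dict[str, float]) -> str:
    """Map the summed score to a label (assumption: zero means neutral)."""
    score = sentence_score(sentence, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical lexicon: positive words above zero, negative words below zero.
EXAMPLE_LEXICON = {"good": 0.9, "great": 0.8, "bad": -0.7, "awful": -0.9}
```

For example, `classify("A great day", EXAMPLE_LEXICON)` falls on the positive side because the only sentiment-bearing word has a score above zero.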

  2. System2: The previous system was augmented with a rule-based model to handle valence shifters, i.e. words such as negations that change the polarity of polar words. They use a list of 75 valence shifter words and an influence span determined by the valence shifter and the closest punctuation mark.
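One simple way to picture the influence span is sketched below. It assumes that a shifter only flips the sign of sentiment words and that its span runs forward from the shifter to the next punctuation mark; the actual rules and the 75-word list are not reproduced in this summary, so the shifter set and lexicon here are hypothetical.

```python
import re
from typing import Dict, Set

PUNCTUATION = set(",.;:!?")


def shifted_sentence_score(
    sentence: str, lexicon: Dict[str, float], shifters: Set[str]
) -> float:
    """Sum word scores, flipping polarity inside a valence shifter's span."""
    tokens = re.findall(r"[a-z']+|[,.;:!?]", sentence.lower())
    total = 0.0
    flipped = False
    for token in tokens:
        if token in PUNCTUATION:
            flipped = False            # the span ends at the closest punctuation
        elif token in shifters:
            flipped = not flipped      # a shifter opens (or cancels) a span
        else:
            value = lexicon.get(token, 0.0)
            total += -value if flipped else value
    return total


# Hypothetical toy resources for illustration only.
TOY_LEXICON = {"good": 1.0, "bad": -1.0}
TOY_SHIFTERS = {"not", "never"}
```

With these toy resources, "It is not good." scores below zero, while in "It is not good, it is good." the comma ends the span and the second clause keeps its positive score.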

Dataset Used

Andreevskaia_et_al tested their system on two datasets.

Each dataset contained 600 manually annotated sentences: 200 positive, 200 negative and 200 neutral.

Hurst_Nigam_et_al used a dataset containing 16,616 sentences from 982 messages extracted from online resources (Usenet, online message boards, etc.) about certain domains. They manually annotated 250 randomly selected sentences with the following labels:

  • Polarity Identification: positive, negative
  • Topic Identification: Topical, Out-of-Topic
  • Polarity and Topic Identification: positive-correlated, negative-correlated, positive-uncorrelated, negative-uncorrelated.

Other Discussions

  • Similarities: The two approaches are clearly similar. Both work at the sentence level, and both use manual sentiment word lists and a rule-based approach to polarity detection. Both also handle valence shifters and rely on manual annotation.
  • Differences: Hurst_et_al work on online messages that deal with reviews, whereas Andreevskaia_et_al use web blogs. Hurst_et_al deal with both topicality and polarity at the sentence level, while Andreevskaia_et_al seem to work on sentiment identification only. The Andreevskaia_et_al experiments appear to give more insight because they work on a bigger dataset, have better manual annotations (high inter-annotator agreement), and use more generic sentiment word lists (not domain tuned). The automatic expansion of the sentiment word lists using WordNet also makes their approach more promising.

Other Questions

  1. How much time did you spend reading the (new, non-wikified) paper you summarized? 2 hours 30 min
  2. How much time did you spend reading the old wikified paper? 1 hour
  3. How much time did you spend reading the summary of the old paper? 20 min
  4. How much time did you spend reading background material? 1 hour
  5. Was there a study plan for the old paper? Yes
    • If so, did you read any of the items suggested by the study plan, and how much time did you spend reading them? The wikified paper didn't require much background knowledge. It pointed to WordNet, which is well known, and to the Hurst_Nigam paper referred to above, which was being reviewed anyway. As such, no further reading was required.
  6. Give us any additional feedback you might have about this assignment.

I think this is a nice way to ensure that previously wikified papers are reviewed by other people while also helping the reader to write new summaries on related papers with similar concepts much faster.