SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining


Citation

Andrea Esuli and Fabrizio Sebastiani, "SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining". In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC'06), pp. 417-422.

Online version

LREC 2006, SentiWordNet: A Publicly Available Lexical Resource for Opinion Mining

Summary

This paper describes the development of SentiWordNet, a lexical resource in which each WordNet synset s is associated with three numerical scores, Obj(s), Pos(s) and Neg(s), describing how objective, positive and negative the terms contained in the synset are.

The motivation behind this research is to aid opinion mining by providing an off-the-shelf lexical resource with fine-grained opinion tags for a large set of words.

The development method of SentiWordNet adapts the PN-polarity [Esuli and Sebastiani, 2005] and SO-polarity [Esuli and Sebastiani, 2006] identification methods. The proposed method uses a set of ternary classifiers, each capable of deciding whether a synset is Positive, Negative or Objective. The ternary classifiers differ from one another in two respects: the training data used and the learner. Each ternary classifier therefore produces different classification results for a synset, and the final opinion scores are obtained by normalizing the scores (votes) from all the classifiers.

Background

In opinion mining there are mainly three tasks related to tagging a given text with the opinion it expresses:

  1. Determining text SO-polarity: deciding whether the text is factual in nature or expresses an opinion (Pang and Lee, 2004; Hatzivassiloglou, 2003).
  2. Determining text PN-polarity: deciding whether the text expresses a positive or a negative opinion on its subject matter (Pang and Lee, 2004; Turney, ACL 2002).
  3. Determining the strength of text PN-polarity: deciding how emphatically the opinion is expressed (Weak, Mild, Strong) (Pang and Lee, 2005; Wilson et al., 2004).

Method

Training Data

A small subset L of the training data is manually labeled. L is the union of a set of objective synsets L_o, a set of positive synsets L_p, and a set of negative synsets L_n. L_p and L_n are used as seed datasets to which related synsets are iteratively added, producing new training sets at each step; the final training sets Tr_p^k and Tr_n^k are obtained after k iterations. The iteration strategy uses binary relations between synsets to navigate WordNet, adding each newly reached synset to a training set according to the relation's property: the synset is added to the same training set if the relation preserves polarity, and to the opposite training set if the relation inverts it. Following Valitutti et al., 2004, the authors use the direct antonymy, similarity, derived-from, pertains-to, attribute and also-see relations to expand the seed datasets.
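The sketch below illustrates one way this expansion could be implemented with NLTK's WordNet interface; the seed synsets, the number of iterations, the exact set of relations queried, and the final removal of synsets reached with both labels are illustrative assumptions, not the paper's actual code (which worked on WordNet 2.0).

```python
# Illustrative expansion of the seed sets via WordNet relations (NLTK interface).
from nltk.corpus import wordnet as wn

def preserving_neighbors(synset):
    # Relations assumed to preserve polarity: similarity, also-see, attribute,
    # derived-from and pertains-to (the last two are lemma-level relations).
    neighbors = set(synset.similar_tos() + synset.also_sees() + synset.attributes())
    for lemma in synset.lemmas():
        for related in (lemma.derivationally_related_forms(), lemma.pertainyms()):
            neighbors.update(l.synset() for l in related)
    return neighbors

def inverting_neighbors(synset):
    # Direct antonymy inverts polarity.
    return {ant.synset() for lemma in synset.lemmas() for ant in lemma.antonyms()}

def expand(pos_seeds, neg_seeds, k):
    tr_p, tr_n = set(pos_seeds), set(neg_seeds)
    for _ in range(k):
        new_p = {s for syn in tr_p for s in preserving_neighbors(syn)} | \
                {s for syn in tr_n for s in inverting_neighbors(syn)}
        new_n = {s for syn in tr_n for s in preserving_neighbors(syn)} | \
                {s for syn in tr_p for s in inverting_neighbors(syn)}
        tr_p |= new_p
        tr_n |= new_n
    # Drop synsets reached with both labels (one possible tie-breaking choice).
    return tr_p - tr_n, tr_n - tr_p

tr_p, tr_n = expand({wn.synset('good.a.01')}, {wn.synset('bad.a.01')}, k=2)
```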

The objective set L_o is collected using two criteria. First, collect all synsets that do not belong to either L_p or L_n. Second, collect synsets containing terms not marked as either Positive or Negative in the General Inquirer lexicon (Stone et al., 1966).
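A minimal sketch of the objective-set construction under the same assumptions; gi_positive and gi_negative stand for sets of words tagged Positiv/Negativ in the General Inquirer lexicon (loading that lexicon's file format is not shown here).

```python
# Illustrative construction of the objective set L_o; gi_positive / gi_negative
# are assumed sets of lower-cased words from the General Inquirer lexicon.
def build_objective_set(all_synsets, tr_p, tr_n, gi_positive, gi_negative):
    subjective_words = gi_positive | gi_negative
    return {s for s in all_synsets
            if s not in tr_p and s not in tr_n
            and not any(l.name().lower() in subjective_words for l in s.lemmas())}
```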

Data Representation

Each synset is represented by a vector-space (bag-of-words) model of its gloss (its brief textual definition), after stop-word removal. The assumption is that terms with similar polarity tend to have similar glosses, so gloss words are good features for training the classifiers.
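A minimal sketch of this representation, assuming NLTK for the glosses and scikit-learn for the bag-of-words vectors; cosine-normalised term frequencies are assumed here, and the exact weighting scheme used by the authors may differ.

```python
# Gloss-based vector representation of synsets (NLTK glosses + scikit-learn).
from nltk.corpus import wordnet as wn
from sklearn.feature_extraction.text import TfidfVectorizer

synsets = list(wn.all_synsets())
glosses = [s.definition() for s in synsets]      # each synset's gloss text

# Stop-word removal and L2 (cosine) normalisation of term-frequency vectors.
vectorizer = TfidfVectorizer(stop_words='english', use_idf=False, norm='l2')
X = vectorizer.fit_transform(glosses)            # one row per synset
```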

Ternary Classification Model

The vectorial representations of the training synsets are given to a standard supervised learner, which generates two binary classifiers. The first classifier learns to distinguish Positive from not-Positive synsets; the second learns to distinguish Negative from not-Negative synsets. A new synset is labeled Positive if the first classifier classifies it as Positive and the second classifies it as not-Negative, and symmetrically for Negative. Synsets classified (i) as both Positive and Negative or (ii) as both not-Positive and not-Negative are taken to be Objective. In the training phase the synsets in Tr_n^k ∪ L_o are used as training examples of the not-Positive category, and the synsets in Tr_p^k ∪ L_o as training examples of the not-Negative category. The resulting ternary classifier is then applied to the vectorial representations of all WordNet synsets (including those not in L) to produce the sentiment classification of the entire WordNet.
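The sketch below assembles one ternary classifier from two binary classifiers following the decision rule described above, assuming the gloss vectors from the previous sketch and a linear SVM as the learner; variable names and the learner choice are illustrative.

```python
# Sketch of one ternary classifier built from two binary classifiers.
# X_p, X_n, X_o are feature matrices (one gloss vector per row) for the
# positive, negative and objective training sets.
from scipy.sparse import vstack
from sklearn.svm import LinearSVC

def train_ternary(X_p, X_n, X_o, learner=LinearSVC):
    n_p, n_n, n_o = X_p.shape[0], X_n.shape[0], X_o.shape[0]
    # Classifier 1: Positive vs not-Positive (not-Positive = Tr_n^k union L_o).
    clf_pos = learner().fit(vstack([X_p, X_n, X_o]),
                            [1] * n_p + [0] * (n_n + n_o))
    # Classifier 2: Negative vs not-Negative (not-Negative = Tr_p^k union L_o).
    clf_neg = learner().fit(vstack([X_n, X_p, X_o]),
                            [1] * n_n + [0] * (n_p + n_o))
    return clf_pos, clf_neg

def classify(clf_pos, clf_neg, x):
    is_pos = clf_pos.predict(x)[0] == 1
    is_neg = clf_neg.predict(x)[0] == 1
    if is_pos and not is_neg:
        return 'Positive'
    if is_neg and not is_pos:
        return 'Negative'
    return 'Objective'   # both labels or neither -> Objective
```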

Classifier Committee/Set

In the Esuli and Sebastiani, 2006 paper the authors experimented with polarity classifiers under different training-data and learner configurations. Some important observations to consider are as follows:

  1. A lower number of iterations on the seed datasets produces small training sets Tr_p^k and Tr_n^k. This results in low recall and high precision. Increasing k makes these sets larger, which increases recall and reduces precision.
  2. Learners that use information about the prior probabilities of categories, such as Naive Bayes and SVM, are sensitive to the relative cardinalities of the training sets. Learners like Rocchio do not exhibit this behavior.
  3. The above two variations do not affect overall accuracy but only the balance of classifications between subjective and objective items, while the accuracy in discriminating positive from negative items tends to stay constant.

Using these observations, four training sets of different sizes are generated by varying k (0, 2, 4, 6), and two learners (Rocchio and SVM) are used with each dataset. This yields eight ternary classifiers.
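One natural reading of the score normalization described above is that each of the eight ternary classifiers casts one vote per synset and the final Pos/Neg/Obj scores are the normalized vote counts; the sketch below implements that reading, reusing the classify helper from the previous sketch.

```python
# Committee scoring: each ternary classifier casts one vote per synset and the
# Pos/Neg/Obj scores are the normalised vote counts (reuses classify() above).
from collections import Counter

def committee_scores(classifiers, x):
    """classifiers: list of (clf_pos, clf_neg) pairs; x: one synset's gloss vector."""
    votes = Counter(classify(cp, cn, x) for cp, cn in classifiers)
    total = sum(votes.values())          # 8 in the paper's configuration
    return {label: votes[label] / total
            for label in ('Positive', 'Negative', 'Objective')}
```

With eight classifiers every score is a multiple of 1/8 = 0.125, which is consistent with the 0.125 score thresholds appearing in the results below.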

Results

  1. The SVM learner with the k=0 training set has low recall and high precision; the SVM learner with the k=6 training set has high recall and low precision.
  2. Rocchio behaves very similarly to SVM even though it does not depend on the prior probabilities of categories.
  3. Approximately 24% of the synsets have some degree of opinion-related properties.
  4. As the objectivity score decreases (i.e., subjectivity increases), the number of synsets involved decreases rapidly, from 10.45% for Objectivity <= 0.5 to 0.56% for Objectivity <= 0.125. This indicates that only a few synsets are unquestionably subjective (i.e., a large majority of the classifiers vote the same way).
  5. Adverb and adjective synsets are evaluated as subjective (Obj(s) < 1) much more frequently than verb synsets. This indicates that in natural language opinionated content is most often carried by parts of speech used as modifiers (adjectives, adverbs) rather than by parts of speech used as heads (verbs, nouns).

Evaluation

The authors note that it is not possible to measure the accuracy of SentiWordNet directly, since no significant amount of labeled test data exists.

Still, the following two approaches provide a basic evaluation:

  1. The General Inquirer lexicon tags (Stone et al., 1966) are compared with the SentiWordNet tags to estimate accuracy. The results are reported in Esuli and Sebastiani, 2006.
  2. The second approach is to create a hand-labeled test dataset of 1000 synsets and compare the manual tags with the SentiWordNet tags. This work is still in progress. It is important to note that a test set of 1000 synsets is very small compared to the roughly 115,000 synsets in WordNet, so this approach is only indicative of SentiWordNet's level of accuracy.

Visualization

A synset's scores can be visualized in a triangular format. Since for a given synset s the scores Obj(s) + Pos(s) + Neg(s) sum to 1, the three scores fit inside a triangle whose corners correspond to a score of 1.0 for one tag and 0.0 for the other two; the center of the triangle corresponds to 0.33 for all three tags (Objective, Positive, Negative). The PN-polarity and SO-polarity axes can then be used to read off (i) how subjective vs. objective and (ii) how positive vs. negative a given synset is.

[Figure: example of the triangular visualization of a synset's scores]
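A small sketch of the mapping from the three scores to the two axes mentioned above; the axis definitions (PN-polarity as Pos − Neg, SO-polarity as Pos + Neg = 1 − Obj) are the usual interpretation of the triangle, not code from the paper.

```python
# Mapping (Pos, Neg, Obj) scores to the PN-polarity / SO-polarity axes.
def to_axes(pos, neg, obj):
    assert abs(pos + neg + obj - 1.0) < 1e-6   # the three scores sum to 1
    pn_polarity = pos - neg    # +1 = fully positive, -1 = fully negative
    so_polarity = pos + neg    # = 1 - obj: 0 = fully objective, 1 = fully subjective
    return pn_polarity, so_polarity

# e.g. Pos=0.625, Neg=0.0, Obj=0.375 maps to (0.625, 0.625).
```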

Discussion

  1. There has been considerable work on identifying whether a term has a positive or negative connotation, but much less research on identifying whether a text is opinionated at all (subjective or objective). This paper provides a combined approach to these two problems.
  2. Esuli and Sebastiani, 2006, Riloff et al., 2003, and Vegnaduzzo, 2004 classify terms rather than word senses. In contrast:
    1. The authors switch from a term-based approach to classifying synsets, on the assumption that different senses of a term may have different opinion-related properties. Each opinion score ranges from 0.0 to 1.0 and the three scores sum to 1.0 for each synset. This means a synset can carry graded scores for more than one opinion label, whereas in the previous approaches a single opinion was hardwired to a term. A similar intuition was presented by Kim and Hovy, 2004, Turney and Littman, 2003, and Andreevskaia and Bergler, 2006, but all of them interpreted the score as confidence in the correctness of the label, whereas in this paper the score is interpreted as the strength of the deemed property.
  3. The use of the glosses associated with synsets as features for classification is a novel approach.

Related Papers

  • Andrea Esuli and Fabrizio Sebastiani. 2006. Determining Term Subjectivity and Term Orientation for Opinion Mining. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL'06).
    • This paper is a prerequisite for understanding the authors' previous work on term subjectivity identification and term orientation identification. The current paper adapts the methods discussed there.
  • Bo Pang and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of ACL-04, 42nd Meeting of the Association for Computational Linguistics, pages 271–278, Barcelona, ES.
    • It describes the Subjectivity, Objectivity identification of text.
    • It describes classification of text into Positive or Negative sentiment.
  • Vasileios Hatzivassiloglou and Kathleen R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of ACL-97, 35th Annual Meeting of the Association for Computational Linguistics, pages 174–181, Madrid, ES.
    • It describes early work in using sentiment word lists to classify text into Positive or Negative sentiment.
  • Alessandro Valitutti, Carlo Strapparava, and Oliviero Stock. 2004. Developing affective lexical resources. PsychNology Journal, 2(1):61–83.
  • P. J. Stone, D. C. Dunphy, M. S. Smith, and D. M. Ogilvie. 1966. The General Inquirer: A Computer Approach to Content Analysis. MIT Press, Cambridge, US.
    • Describes the General Inquirer lexicon used to evaluate the results in SentiWordNet.

Study Plan