The viability of web-derived Polarity Lexicons

This is a paper reviewed for Social Media Analysis 10-802 in Fall 2012.

Citation

 author    = {Leonid Velikovich and
              Sasha Blair-Goldensohn and
              Kerry Hannan and
              Ryan T. McDonald},
 title     = {The viability of web-derived polarity lexicons},
 booktitle = {HLT-NAACL},
 year      = {2010},
 pages     = {777-785},
 ee        = {http://www.aclweb.org/anthology/N10-1119},
 bibsource = {DBLP, http://dblp.uni-trier.de}

Online version

The viability of web-derived polarity lexicons

Summary

The authors examine the viability of building large polarity lexicons semi-automatically from the web. They describe a graph propagation approach for building an English lexicon that, unlike previous approaches to sentiment analysis, does not rely on language-dependent resources such as WordNet or POS taggers. As a result, the proposed lexicons are not limited to specific word classes and also contain slang, misspellings, multiword expressions, etc. The authors report qualitative and quantitative evaluations of the derived lexicon and show that it outperforms previously studied lexicons on the sentence polarity classification task.

Approach

Polarity lexicons are large lists of phrases that encode the polarity of each phrase, either positive or negative, often with a score that represents the magnitude of the polarity. The authors propose a graph propagation approach inspired by previous work on constructing polarity lexicons from lexical graphs, but without using linguistic resources like WordNet. Instead, the graph is built using co-occurrence statistics from the entire web.

The algorithm differs from common graph propagation algorithms such as label propagation. It outputs a polarity vector $pol$ in which $pol_i$ is the polarity score of candidate phrase $v_i$. The algorithm computes both a positive score $pol^+_i$ and a negative score $pol^-_i$ for each node $v_i$ in the graph. Each score is the sum, over every seed word (positive or negative, respectively), of the weight of the maximum-weight path from that seed to $v_i$. The final polarity of a phrase is $pol_i = pol^+_i - \beta \, pol^-_i$, where $\beta$ is a constant that accounts for the overall mass of positive and negative flow in the graph. The algorithm is iterative and considers paths of increasing length at each iteration. The input variable $T$ controls the maximum path length considered. The parameter $\gamma$ defines the minimum polarity magnitude a phrase must have to be included in the lexicon.

[Figure: graph propagation algorithm]

Weighted graph G = (V, E):

V is the set of nodes and E is the set of edges; $w_{ij}$ is the weight of edge $(v_i, v_j)$.

P is the set of positive seed nodes and N is the set of negative seed nodes.
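The propagation described above can be sketched on a toy graph. This is a minimal illustration and not the authors' implementation: the function names, the toy edge weights, and the synchronous update scheme are assumptions made for the example, and the constant beta is assumed here to be the ratio of total positive mass to total negative mass.

```python
from collections import defaultdict


def max_path_scores(edges, seeds, T):
    """For each node, sum over seeds the best (max-product) path weight of length <= T."""
    totals = defaultdict(float)
    for seed in seeds:
        alpha = {seed: 1.0}  # best path weight found so far from this seed
        for _ in range(T):  # each pass allows paths one edge longer
            updated = dict(alpha)
            for k, a_k in alpha.items():
                for j, w in edges.get(k, {}).items():
                    cand = a_k * w
                    if cand > updated.get(j, 0.0):
                        updated[j] = cand
            alpha = updated
        for node, a in alpha.items():
            totals[node] += a
    return totals


def propagate(edges, pos_seeds, neg_seeds, T=3, gamma=0.0):
    """Return {phrase: polarity}, dropping phrases with |polarity| < gamma."""
    pos = max_path_scores(edges, pos_seeds, T)
    neg = max_path_scores(edges, neg_seeds, T)
    neg_mass = sum(neg.values())
    # Assumption: beta rescales negative flow by the overall positive/negative mass.
    beta = sum(pos.values()) / neg_mass if neg_mass else 1.0
    lexicon = {}
    for node in set(pos) | set(neg):
        score = pos.get(node, 0.0) - beta * neg.get(node, 0.0)
        if abs(score) >= gamma and score != 0.0:
            lexicon[node] = score
    return lexicon


# Hypothetical toy graph: edge weights stand in for context similarity.
toy_edges = {
    "good": {"great": 0.8, "fine": 0.5},
    "great": {"good": 0.8, "kewl": 0.6},
    "fine": {"good": 0.5, "bad": 0.2},
    "kewl": {"great": 0.6},
    "bad": {"awful": 0.9, "fine": 0.2},
    "awful": {"bad": 0.9},
}
lexicon = propagate(toy_edges, ["good"], ["bad"], T=3)
for phrase, score in sorted(lexicon.items(), key=lambda kv: -kv[1]):
    print(f"{phrase:6s} {score:+.3f}")
```

In this toy run "kewl" ends up positive because its only strong path leads to the positive seed, while "fine", which sits between both seeds, receives a small score and is dropped once gamma is raised.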

  • Building the Phrase Graph from the web:

For this study the authors used an English graph in which the node set V was based on all n-grams up to length 10 extracted from 4 billion web pages, filtered down to 20 million candidate phrases via heuristics. A context vector was built for each phrase from a window of size six, aggregated over all mentions of the phrase in the page set. The edge set E was constructed by computing the cosine similarity between context vectors and then keeping only the 25 highest-weighted edges adjacent to each node, which reduces the graph size and removes spurious edges caused by frequently occurring phrases. Because of the large context windows, this graph can contain edges between positive and negative sentiment words. The authors argue that the algorithm handles this by computing polarity as the aggregate of the best paths to all seed words. Using only the best path to each seed word, rather than all paths as in label propagation, is the main difference between the two approaches.
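The graph construction step can be illustrated at toy scale. The sketch below is an assumption-laden simplification rather than the authors' pipeline: it handles single-token phrases only, interprets the window as six tokens on each side of a mention, compares all pairs of phrases (infeasible for 20 million candidates), and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences, phrases, window=6):
    """Aggregate, per phrase, counts of tokens within `window` positions of each mention."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in phrases:
                lo, hi = max(0, i - window), i + window + 1
                vectors[tok].update(tokens[lo:i] + tokens[i + 1:hi])
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(vectors, top_k=25):
    """Keep each node's top_k most similar neighbours; edges are stored symmetrically."""
    nodes = list(vectors)
    edges = defaultdict(dict)
    for u in nodes:
        sims = [(cosine(vectors[u], vectors[v]), v) for v in nodes if v != u]
        sims = [(s, v) for s, v in sims if s > 0]
        for s, v in sorted(sims, reverse=True)[:top_k]:
            edges[u][v] = s
            edges[v][u] = s
    return edges


# Hypothetical mini-corpus; real input would be n-grams from web pages.
sentences = [
    "the food was great".split(),
    "the food was awesome".split(),
    "the service was slow".split(),
]
vectors = context_vectors(sentences, {"great", "awesome", "slow"})
graph = build_graph(vectors, top_k=1)
print(dict(graph))
```

Even in this tiny example "slow" is linked to a positive word because it shares the context "the ... was", which mirrors the point above that context-based edges can connect phrases of opposite polarity.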

Task Description and Evaluation

  • Lexicon Statistics:
    • Generated from 187 positive seeds and 192 negative seeds, manually annotated, producing a lexicon of 178104 phrases.
  • Comparison with other Lexicon Sets:
    • Wilson et al. 7628 Phrases [2718 Positive 4910 Negative]
    • WordNet LP 12310 Phrases [5705 Positive 6605 Negative]
    • Web GP : Web derived lexicon 178104 Phrases [90337 Positive 87767 Negative]
    • The web-derived lexicon is substantially larger than the other benchmark lexicons.
  • Qualitative Evaluation:
    • The most frequent phrase lengths are 2 (~60%), 1 (~21%) and 3 (~16%). Longer phrases are less frequent.
    • Multi-word phrases are captured ("just what the doctor ordered" [+] and "out of touch" [-]).
    • Spelling variations are more prominent among positive phrases (like "cooool", "kewl") than among negative phrases.
    • Vulgar and derogatory phrases and racial slurs are abundant among phrases that received negative sentiment.
  • Quantitative Evaluation:
    • Performance measured on Sentence Sentiment Classification/Ranking task.
    • Dataset of 554 consumer reviews described in McDonald et al., 2007, containing 3916 sentences: 1525 positive, 1542 negative and 849 neutral.
    • Evaluation:
      • Lexicon classifier
        • Classification done using an augmented Vote-Flip algorithm.
        • Ranking done using the purity of sentence x, a score in [-1, 1].
      • Contextual Classifier: maximum entropy model trained and evaluated with 10-fold cross-validation on the evaluation data.
        • Meta Classifier: Contextual classifier using features derived from all lexicons.
  • Results:

[Table: results for the Web GP lexicon (WebGP tab4.jpg)]
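To make the lexicon classifier in the evaluation more concrete, here is a simplified sketch of vote-flip classification and purity-based ranking. It is not the augmented variant used in the paper: the negation list, the single-token lexicon matching, the tie-to-neutral rule, and the smoothing constant delta are all assumptions for illustration, and purity is taken here as the sum of phrase scores divided by the smoothed sum of their absolute values.

```python
NEGATIONS = {"not", "no", "never", "n't"}  # hypothetical negation word list


def vote_flip(tokens, lexicon):
    """Majority vote over lexicon hits, flipped when the negation count is odd."""
    pos = sum(1 for t in tokens if lexicon.get(t, 0.0) > 0)
    neg = sum(1 for t in tokens if lexicon.get(t, 0.0) < 0)
    if pos == neg:
        return "neutral"  # assumption: ties and sentences with no hits are neutral
    label = "positive" if pos > neg else "negative"
    if sum(1 for t in tokens if t in NEGATIONS) % 2 == 1:
        label = "negative" if label == "positive" else "positive"
    return label


def purity(tokens, lexicon, delta=1e-3):
    """Normalised sum of lexicon scores; lies in [-1, 1] and is 0 with no hits."""
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / (delta + sum(abs(s) for s in scores))


# Hypothetical lexicon scores, in the style of a propagated polarity lexicon.
toy_lexicon = {"great": 0.7, "kewl": 0.5, "awful": -1.1}

print(vote_flip("the food was great".split(), toy_lexicon))
print(vote_flip("the food was not great".split(), toy_lexicon))
print(round(purity("great room but awful service".split(), toy_lexicon), 3))
```

Sentences can then be ranked by purity, so that sentences whose lexicon hits agree in polarity rank ahead of those with mixed evidence.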

Findings

  • Web-derived lexicons seem to capture phrases not captured by earlier systems, such as spelling variations, slang, vulgarity and multi-word expressions. This can be attributed to the large search space in which the algorithm runs, unlike systems built on WordNet, which has no nodes representing such phrases. As a result, web-derived lexicons cover a wide range of the vocabulary prevalent on the web.
  • The paper reports that the web-derived lexicon outperforms previously published English lexicons.
  • One major advantage of the technique is that it does not depend on language-dependent resources like WordNet and instead uses unlabeled data, which is readily available. This makes it well suited to languages that lack structured linguistic resources but have abundant unlabeled text.

Related papers

  • Description of WordNet LP

S. Blair-Goldensohn, K. Hannan, R. McDonald, T. Neylon, G.A. Reis, and J. Reynar. 2008. Building a sentiment summarizer for local service reviews. In NLP in the Information Explosion Era.

  • Wilson et al. Lexicon set.

T. Wilson, J. Wiebe, and P. Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

  • Dataset for evaluation McDonald et al.

R. McDonald, K. Hannan, T. Neylon, M. Wells, and J. Reynar. 2007. Structured models for fine-to-coarse sentiment analysis. In Proceedings of the Annual Conference of the Association for Computational Linguistics (ACL).

  • Related work in Japanese.

N. Kaji and M. Kitsuregawa. 2007. Building lexicon for sentiment analysis from massive collection of HTML documents. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).

  • Label Propagation

X. Zhu and Z. Ghahramani. 2002. Learning from labeled and unlabeled data with label propagation. Technical report, CMU CALD tech report CMU-CALD-02.

Study plan