M. Hurst and K. Nigam. Retrieving topical sentiments from online document collections.
This is a paper reviewed for Social Media Analysis 10-802 in Fall 2012.
Citation
title={Retrieving topical sentiments from online document collections}, author={Hurst, M.F. and Nigam, K.}, booktitle={Proceedings of SPIE}, volume={5296}, pages={27--34}, year={2004}
Online version
Retrieving topical sentiments from online document collections
Summary
This is one of the earlier works to combine topicality and polarity, i.e., to identify polar sentences about a topic. The authors argue for fusing the two, using statistical machine learning approaches to identify topics and shallow NLP techniques to determine polarity. They argue that a polar sentence that contains the topic expresses polarity about that topic.
Dataset Description
16,616 sentences from 982 messages from online resources (Usenet, online message boards, etc.) about a certain topic. 250 randomly selected sentences were manually annotated with the following labels:
- Polarity Identification: positive, negative
- Topic Identification: topical, out-of-topic
- Polarity and Topic Identification: positive-correlated, negative-correlated, positive-uncorrelated, negative-uncorrelated, topical, out-of-topic. The positive-correlated label indicates that the sentence contains a positive polar segment that refers to the topic; positive-uncorrelated indicates that there is some positive polarity but that it is not associated with the topic in question.
Features Used
Task Description and Evaluation
Polarity Identification:
The authors use a rule-based approach to polarity identification, with the following steps:
- Tokenization, followed by POS tagging using a statistical tagger trained on Penn Treebank data.
- Semantic polarity tagging using a manually created, predefined topical lexicon tuned for the domain.
- Chunking using simple POS tag patterns.
- Rule-based syntactic patterns and negation rules to modify polarity and associate it with topics.
- Syntactic patterns: predicative modification (it is good), attributive modification (a good car), equality (it is a good car), polar clause (it broke my car).
- Negation rules: verbal attachment (it is not good, it isn't good).
Performance: Their system achieved a precision of 82% at detecting positive polarity and 80% at detecting negative polarity.
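The lexicon lookup and negation steps described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon entries, the `NEGATORS` set, the three-token negation window, and the function names are all assumptions made for illustration. POS tagging and chunking are skipped entirely, so verbal-attachment negation is only approximated by looking for a negator shortly before a polar word.

```python
import re

# Hypothetical toy lexicon; the paper's lexicon was hand-built and domain-tuned.
POLARITY_LEXICON = {
    "good": 1, "great": 1, "reliable": 1,
    "bad": -1, "terrible": -1, "broke": -1,
}

# Hypothetical negator list used to approximate the verbal-attachment rule.
NEGATORS = {"not", "n't", "never"}


def tokenize(sentence):
    """Lowercase and split into word tokens, separating the clitic "n't"."""
    text = re.sub(r"n't\b", " n't", sentence.lower())
    return re.findall(r"n't|[a-z]+", text)


def sentence_polarity(sentence, window=3):
    """Return 'positive', 'negative', or 'neutral' for one sentence.

    Each lexicon word contributes +1 or -1; the sign is flipped when a
    negator appears within `window` tokens before it (an assumed stand-in
    for the syntactic negation rules).
    """
    tokens = tokenize(sentence)
    score = 0
    for i, token in enumerate(tokens):
        polarity = POLARITY_LEXICON.get(token, 0)
        if polarity == 0:
            continue
        preceding = tokens[max(0, i - window):i]
        if any(t in NEGATORS for t in preceding):
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for example in ["It is a good car", "It isn't good", "It broke my car"]:
        print(example, "->", sentence_polarity(example))
```

A fixed token window is much cruder than the chunk-level syntactic patterns listed above; it is used here only to keep the example self-contained.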
Topic Identification:
Topic and Polarity Identification