Difference between revisions of "M. Hurst and K. Nigam. Retrieving topical sentiments from online document collection."

From Cohen Courses

Revision as of 22:04, 4 November 2012

This is a paper reviewed for Social Media Analysis 10-802 in Fall 2012.

Citation

 title={Retrieving topical sentiments from online document collections},
 author={Hurst, M.F. and Nigam, K.},
 booktitle={Proceedings of SPIE},
 volume={5296},
 pages={27--34},
 year={2004}

Online version

Retrieving topical sentiments from online document collections

Summary

This is one of the earlier works to combine topicality and polarity, i.e., to identify polar sentences about a topic. The authors argue for fusing the two by using statistical machine learning to identify topics and shallow NLP techniques to determine polarity. They argue that a polar sentence that contains the topic expresses polarity about that topic.

Dataset Description

16,616 sentences from 982 messages from online resources (Usenet, online message boards, etc.) about a certain topic. 250 randomly selected sentences were manually annotated with the following labels:

- Polarity Identification: positive, negative

- Topic Identification: Topical, Out-of-Topic

- Polarity and Topic Identification: positive-correlated, negative-correlated, positive-uncorrelated, negative-uncorrelated, topical, out-of-topic. The positive-correlated label indicates that the sentence contained a positive polar segment that referred to the topic; positive-uncorrelated indicates that there was some positive polarity but that it was not associated with the topic in question.

Features Used

Task Description and Evaluation

Polarity Identification:

The authors use a rule-based approach to polarity identification. It has the following steps:

- Tokenization followed by POS tagging, using a statistical tagger trained on Penn Treebank data.

- Semantic polarity tagging using a manually created, predefined topical lexicon tuned for the domain.

- Chunking using simple POS tag patterns.

- Rule-based syntactic patterns and negation rules to modify polarity and associate it with topics.

- Syntactic patterns: predicative modification (it is good), attributive modification (a good car), equality (it is a good car), polar clause (it broke my car). Negation rules: verbal attachment (it is not good, it isn't good).
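The lexicon lookup and the verbal-attachment negation rule can be illustrated with a minimal Python sketch. It is not the authors' system: it skips POS tagging and chunking entirely, and the lexicon entries, the negator list, and the three-token negation window are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon; the paper's lexicon is hand-built and domain-tuned.
POLARITY_LEXICON = {"good": 1, "great": 1, "bad": -1, "broke": -1}
# Assumed negator list, including the clitic "n't" split off by tokenize().
NEGATORS = {"not", "n't", "never"}


def tokenize(sentence):
    """Lowercase and split into word tokens, separating the clitic "n't"."""
    text = re.sub(r"n't\b", " n't", sentence.lower())
    return re.findall(r"n't|[a-z]+", text)


def polar_phrases(sentence, window=3):
    """Return (word, polarity) pairs; +1 is positive, -1 is negative.

    Simplified negation rule: a negator within `window` tokens to the
    left of a polar word flips that word's polarity.
    """
    tokens = tokenize(sentence)
    results = []
    for i, tok in enumerate(tokens):
        if tok not in POLARITY_LEXICON:
            continue
        polarity = POLARITY_LEXICON[tok]
        if any(t in NEGATORS for t in tokens[max(0, i - window):i]):
            polarity = -polarity
        results.append((tok, polarity))
    return results


if __name__ == "__main__":
    for s in ["it is good", "it is not good", "it isn't good", "it broke my car"]:
        print(s, "->", polar_phrases(s))
```

A fixed token window is a crude stand-in for the syntactic attachment the paper's rules rely on; with POS tags and chunks available, the negator would instead be tied to the verb that governs the polar phrase.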

Performance: Their system achieved a precision of 82% for detecting positive polarity and 80% for detecting negative polarity.
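For reference, per-label precision is the fraction of sentences predicted with a label that actually carry that label in the gold annotation. The sketch below shows the computation on made-up toy labels; the function name and the label strings are assumptions, and the data is not from the paper.

```python
def precision(gold, predicted, label):
    """Precision for one label: true positives / all predictions of that label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    predicted_as_label = [g for g, p in zip(gold, predicted) if p == label]
    if not predicted_as_label:
        return 0.0  # Convention chosen here: no predictions means precision 0.
    true_positives = sum(1 for g in predicted_as_label if g == label)
    return true_positives / len(predicted_as_label)


if __name__ == "__main__":
    # Toy example only, not the paper's data.
    gold = ["pos", "neg", "neg", "pos"]
    pred = ["pos", "pos", "neg", "pos"]
    print(precision(gold, pred, "pos"))
    print(precision(gold, pred, "neg"))
```

Note that precision alone says nothing about how many polar sentences the system missed; that would require recall as well.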


Topic Identification:


Topic and Polarity Identification

Findings

Related papers

Study plan