Structured Models for Fine-to-Coarse Sentiment Analysis

From Cohen Courses
 
Latest revision as of 11:06, 6 November 2012

This paper was reviewed for Social Media Analysis 10-802 in Fall 2012.

Citation

Ryan McDonald, Kerry Hannan, Tyler Neylon, Mike Wells, Jeff Reynar, 2007, In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics.

Online version

Structured Models for Fine-to-Coarse Sentiment Analysis

Summary

This paper proposes a novel approach to finding sentiment at several levels of granularity (document, paragraph, sentence, phrase, word). It uses a single structured model that transforms the multi-level sentiment classification task into a single problem of learning from a sequence of granular components. It uses a Structured Linear Classifier with constrained Viterbi for inference. The single-model approach performs better than models trained in isolation for a given level of granularity. The model builds on two ideas: first, higher-level classification can benefit from finer-level classification; second, finer-level classification can benefit from higher-level classification.

Dataset

The dataset was compiled by the authors by collecting reviews from three domains: car seats for children, fitness equipment, and MP3 players. Reviews have document-level labels and sentence-level labels.

Dataset.png

Methodology

Model1.png

The figure above shows an undirected graphical model that jointly classifies sentence and document sentiment. The label of each sentence depends on the labels of its neighboring sentences and on the label of the document. The document label depends on the label of every sentence. The dotted lines indicate that the sentences are inputs and are not modeled.
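
The dependencies described above can be illustrated with a small decoding sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: it assumes "constrained Viterbi" can be read as fixing each candidate document label in turn, running Viterbi over the sentence labels, and keeping the best-scoring combination. The label set, all scores, and the `agree_bonus` term are hypothetical stand-ins for the learned feature weights of the real model.

```python
LABELS = ("neg", "neu", "pos")  # hypothetical label set for this sketch


def viterbi(emit, trans):
    """Return (score, path) of the best label sequence for one document.

    emit[i][y] is the score of label y for sentence i; trans[(a, b)] is
    the score of label a being followed by label b.
    """
    best = {y: (emit[0][y], [y]) for y in LABELS}
    for scores in emit[1:]:
        step = {}
        for y in LABELS:
            # Best previous label given that the current sentence gets label y.
            prev = max(LABELS, key=lambda p: best[p][0] + trans[(p, y)])
            total = best[prev][0] + trans[(prev, y)] + scores[y]
            step[y] = (total, best[prev][1] + [y])
        best = step
    return max(best.values(), key=lambda item: item[0])


def joint_decode(sent_scores, trans, doc_scores, agree_bonus=1.0):
    """Fix each document label in turn, run Viterbi, keep the best pair."""
    candidates = []
    for doc in LABELS:
        # Assumption: a sentence label that agrees with the fixed document
        # label receives a flat bonus (stand-in for learned joint features).
        emit = [
            {y: s[y] + (agree_bonus if y == doc else 0.0) for y in LABELS}
            for s in sent_scores
        ]
        score, path = viterbi(emit, trans)
        candidates.append((score + doc_scores[doc], doc, path))
    return max(candidates, key=lambda c: c[0])


# Toy, made-up scores: the second sentence looks slightly negative in isolation.
sent_scores = [
    {"neg": 0.2, "neu": 0.1, "pos": 1.5},
    {"neg": 0.9, "neu": 0.8, "pos": 0.7},
    {"neg": 0.1, "neu": 0.3, "pos": 2.0},
]
trans = {(a, b): (0.5 if a == b else 0.0) for a in LABELS for b in LABELS}
doc_scores = {"neg": 0.0, "neu": 0.0, "pos": 0.5}

score, doc, path = joint_decode(sent_scores, trans, doc_scores)
print(doc, path)
```

With these toy numbers the second sentence, which would be labeled "neg" on its own, is labeled "pos" once its neighbors and the document label are taken into account, which is the kind of interaction the joint model is meant to capture.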

Results

Results are reported for two levels of granularity (sentence and document).

Three baseline systems are used for comparison:

  1. Document-classifier learns to predict the document label only.
  2. Sentence-classifier learns to predict the label of each sentence in isolation.
  3. Sentence-Structured learns to predict sentence labels by treating the document as a sequence of sentences and using a sequence chain model.

Two alternatives to the fine-to-coarse system (cascaded models) are also compared:

  1. Sentence-Structured Model->Document Classifier
  2. Document Classifier->Sentence-Structured Model

The proposed model is called Joint-Structured.

Result1.png

Discussion

  • The authors show good insight into the sentiment classification problem: a sentence can have a different polarity depending on its context, and the overall sentiment of a document need not be a simple aggregate of its sub-components. Considering sub-component labels and component labels together is therefore a sound underlying idea.
  • The joint model improves sentence classification accuracy on Amazon product reviews from 62.6% to 70.3%, a significant gain.
  • The paper uses a product review dataset, which tends to have short documents. It would be helpful to see how the model performs on corpora of longer texts.
  • As mentioned in the paper, this type of model is easy to extend to sparsely labeled data and has significant utility.

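Related papers

  • Choi et al 2005, Choi et al 2006: use CRFs to learn a global sequence model that classifies opinions and assigns sources to them.
  • Mao and G. Lebanon, 2006: uses a sequential CRF regression model to measure polarity at the sentence level in order to determine the sentiment flow of authors in reviews.
  • Pang and Lee, 2004: a cascaded model (an alternative to the fine-to-coarse methodology).

Study plan

  • Structured Linear Classifier
      • Viterbi
  • Choi et al 2006
      • Choi et al 2005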