Ling and He Joint Sentiment Topic Model for Sentiment Analysis

== Citation ==

author = {Lin, Chenghua and He, Yulan},
title = {Joint sentiment/topic model for sentiment analysis},
booktitle = {Proceedings of the 18th ACM conference on Information and knowledge management},
series = {CIKM '09},
year = {2009},
isbn = {978-1-60558-512-3},
location = {Hong Kong, China},
pages = {375--384},
numpages = {10},
url = {http://doi.acm.org/10.1145/1645953.1646003},
doi = {10.1145/1645953.1646003},
acmid = {1646003},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {joint sentiment/topic model, latent dirichlet allocation, opinion mining, sentiment analysis}

== Online Version ==

Joint Sentiment/Topic Model for Sentiment Analysis
== Summary ==

This [[Category::paper]] proposes a probabilistic modeling framework based on Latent Dirichlet Allocation (LDA) which detects sentiment and topics simultaneously from text. Most approaches to sentiment analysis require labeled corpora for training and inference. The proposed model, however, is fully unsupervised, i.e., it does not require any labeled data. The authors also motivate why a joint model is sought: sentiment polarities, intuitively, are dependent on topics or domains. For instance, though the adjective ‘unpredictable’ in a phrase such as ‘unpredictable steering’ has a negative orientation in an automobile review, it has a positive orientation in a phrase like ‘unpredictable plot’ in a movie review.
The Joint Sentiment Topic (JST) model is similar in flavor to LDA; the only major difference is the addition of a sentiment layer between the document and the topic layers. Each document in the JST model is associated with <math>S</math> distinct topic-document distributions, one for each of the <math>S</math> sentiment labels, and each of these distributions is defined over the same number of topics. The distribution defined by the chosen topic and sentiment label is then used to draw each word. The generative process is as follows:
 
 1. For each document <math>d</math>, choose a distribution <math>\pi_d</math> ∼ Dirichlet(<math>\gamma</math>).
 2. For each sentiment label <math>l</math> under document <math>d</math>, choose a distribution <math>\theta_{d,l}</math> ∼ Dirichlet(<math>\alpha</math>).
 3. For each word <math>w_i</math> in document <math>d</math>:
     a. Choose a sentiment label <math>l_i</math> ∼ <math>\pi_d</math>
     b. Choose a topic <math>z_i</math> ∼ <math>\theta_{d,l_i}</math>
     c. Choose a word <math>w_i</math> from the distribution over words defined by the topic <math>z_i</math> and sentiment label <math>l_i</math>, namely <math>\phi^{l_i}_{z_i}</math>
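
To make the generative story concrete, the following is a minimal simulation sketch in Python. The corpus sizes, variable names, and the use of numpy here are illustrative assumptions, not details from the paper.

<pre>
# Minimal sketch of the JST generative process (illustrative sizes and names).
import numpy as np

rng = np.random.default_rng(0)

D, S, T, V = 4, 3, 5, 100   # documents, sentiment labels, topics, vocabulary size
N_words = 50                # words per document (fixed here for simplicity)
alpha, beta, gamma = 0.1, 0.01, 0.3

# Per-sentiment, per-topic word distributions phi^{l}_{z} ~ Dirichlet(beta)
phi = rng.dirichlet(np.full(V, beta), size=(S, T))

corpus = []
for d in range(D):
    pi_d = rng.dirichlet(np.full(S, gamma))              # step 1: sentiment-document dist.
    theta_d = rng.dirichlet(np.full(T, alpha), size=S)   # step 2: one topic dist. per label
    doc = []
    for _ in range(N_words):
        l = rng.choice(S, p=pi_d)                        # step 3a: sentiment label
        z = rng.choice(T, p=theta_d[l])                  # step 3b: topic
        w = rng.choice(V, p=phi[l, z])                   # step 3c: word
        doc.append((l, z, w))
    corpus.append(doc)
</pre>
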
The hyperparameter <math>\alpha</math> can be treated as our prior belief about the number of times topic <math>j</math> was associated with sentiment label <math>l</math> and sampled from a document, and <math>\beta</math> can be viewed as the number of times words sampled from topic <math>j</math> were associated with sentiment label <math>l</math> before any actual words have been observed. Similarly, <math>\gamma</math> can be interpreted as the prior observation count for the number of times sentiment label <math>l</math> was sampled from a document before any words from the corpus were observed. In JST, there are three sets of latent variables to be inferred: the joint sentiment/topic-document distribution <math>\theta_{d,l}</math>, the joint sentiment/topic-word distribution <math>\phi^{l_i}_{z_i}</math>, and the sentiment-document distribution <math>\pi_d</math>.
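
Under this pseudo-count reading, the three latent distributions have the usual Dirichlet-multinomial point estimates. The formulas below are a sketch consistent with the priors above; the count notation <math>N</math> is ours and may differ from the paper's:

<math>\hat{\phi}_{l,j,w} = \frac{N_{w,j,l} + \beta}{N_{j,l} + V\beta}, \qquad \hat{\theta}_{d,l,j} = \frac{N_{j,l,d} + \alpha}{N_{l,d} + T\alpha}, \qquad \hat{\pi}_{d,l} = \frac{N_{l,d} + \gamma}{N_d + S\gamma}</math>

where <math>N_{w,j,l}</math> counts how often word <math>w</math> is assigned to topic <math>j</math> with sentiment label <math>l</math>; <math>N_{j,l}</math>, <math>N_{j,l,d}</math>, <math>N_{l,d}</math>, and <math>N_d</math> are the corresponding marginal counts; and <math>V</math>, <math>T</math>, <math>S</math> are the vocabulary size and the numbers of topics and sentiment labels.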
  
 
=== Inference ===
A Gibbs sampling algorithm is provided for estimating the posterior distribution of the latent variables given a document. The algorithm sequentially samples each variable of interest from its distribution conditioned on the current values of all the other variables and the data. This sequential update is carried out for a fixed number of Gibbs sampling iterations.
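
The sketch below shows what one such sweep over the corpus could look like in Python, using the count-ratio conditional implied by the Dirichlet priors above. The array names, shapes, and the exact form of the update are our illustrative assumptions, not code or equations reproduced from the paper.

<pre>
def gibbs_sweep(docs, l_asgn, z_asgn, counts, S, T, V, alpha, beta, gamma, rng):
    """One collapsed Gibbs sweep. docs[d] is a list of word ids; l_asgn and
    z_asgn mirror that structure. counts = (N_wjl, N_jl, N_jld, N_ld, N_d),
    numpy integer arrays with shapes (V,T,S), (T,S), (T,S,D), (S,D), (D,)."""
    N_wjl, N_jl, N_jld, N_ld, N_d = counts
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            l, z = l_asgn[d][i], z_asgn[d][i]
            # Remove the current assignment from the counts.
            N_wjl[w, z, l] -= 1; N_jl[z, l] -= 1
            N_jld[z, l, d] -= 1; N_ld[l, d] -= 1
            # Joint conditional over all (topic, sentiment) pairs; the N_d
            # term is constant across pairs and cancels under normalisation.
            p = ((N_wjl[w] + beta) / (N_jl + V * beta)
                 * (N_jld[:, :, d] + alpha) / (N_ld[:, d] + T * alpha)
                 * (N_ld[:, d] + gamma) / (N_d[d] + S * gamma))
            p = (p / p.sum()).ravel()
            z, l = divmod(rng.choice(T * S, p=p), S)   # sample a new pair
            # Add the new assignment back.
            N_wjl[w, z, l] += 1; N_jl[z, l] += 1
            N_jld[z, l, d] += 1; N_ld[l, d] += 1
            l_asgn[d][i], z_asgn[d][i] = l, z
</pre>

Repeating such sweeps for a fixed number of iterations and reading the estimates off the final counts yields <math>\theta_{d,l}</math>, <math>\phi</math>, and <math>\pi_d</math> as sketched above.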
  
 
=== Tying-JST model ===
Under the JST model, one has to choose a topic-document distribution <math>\theta_d</math> for every document. The tying-JST model simplifies this: there is only one topic-document distribution <math>\theta</math>, which accounts for all the documents in the corpus.
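
At the level of parameter shapes, the difference looks as follows. This is only our illustration; in particular, whether the tied distribution is still indexed by sentiment label is our reading and is not spelled out in the summary above.

<pre>
import numpy as np

D, S, T = 4, 3, 5                  # documents, sentiment labels, topics (illustrative)
theta_jst = np.zeros((D, S, T))    # JST: a theta_{d,l} for every document d
theta_tied = np.zeros((S, T))      # tying-JST: a single theta shared by all documents
</pre>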
 
  
 
=== Experiments ===
The authors used a corpus of preprocessed [[UsesDataset::Pang Movie Reviews|movie reviews]] for evaluating the performance of the JST model.
  
 
== Study Plan ==

1. [[RelatedPaper::Blei_et_al_Latent_Dirichlet_Allocation]]
