Ling and He Joint Sentiment Topic Model for Sentiment Analysis

== Citation ==

author = {Lin, Chenghua and He, Yulan},
title = {Joint sentiment/topic model for sentiment analysis},
booktitle = {Proceedings of the 18th ACM conference on Information and knowledge management},
series = {CIKM '09},
year = {2009},
isbn = {978-1-60558-512-3},
location = {Hong Kong, China},
pages = {375--384},
numpages = {10},
url = {http://doi.acm.org/10.1145/1645953.1646003},
doi = {10.1145/1645953.1646003},
acmid = {1646003},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {joint sentiment/topic model, latent dirichlet allocation, opinion mining, sentiment analysis}

== Online Version ==

[http://doi.acm.org/10.1145/1645953.1646003 Joint Sentiment/Topic Model for Sentiment Analysis]

== Summary ==

This [[Category::paper]] proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA) that detects sentiment and topics simultaneously from text. Unlike other machine learning approaches to sentiment classification, which often require labeled corpora for classifier training, the proposed model is fully unsupervised.

Each document in the Joint Sentiment Topic (JST) model is associated with <math>S</math> topic-document distributions, where <math>S</math> is the number of sentiment labels; each of these distributions corresponds to one sentiment label <math>l</math> and is defined over the same number of topics. A word is then drawn from the distribution over words determined by the chosen topic and sentiment label. The generative processes of both LDA and JST are given below.

=== LDA ===

LDA is a generative probabilistic model for collections of discrete data such as text corpora. It is a three-level hierarchical Bayesian model in which each item of a collection is modeled as a finite mixture over an underlying (latent) set of topics, and each topic is in turn characterized by a distribution over words. Each document <math>d</math> is assumed to be generated by the following process:

  1. Choose the number of words <math>N_d</math> in the document by drawing from a Poisson(<math>\xi</math>) distribution
  2. Choose the topic proportions <math>\theta_d</math> from a Dirichlet(<math>\alpha</math>) distribution
  3. For each of the <math>N_d</math> words <math>w_{d,n}</math>
     a. Choose a topic <math>z_{d,n}</math> from a Multinomial(<math>\theta_d</math>) distribution
     b. Choose a word <math>w_{d,n}</math> from <math>p(w_{d,n} | z_{d,n}, \beta)</math>, a multinomial distribution conditioned on the topic <math>z_{d,n}</math>
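
To make the three sampling levels concrete, here is a minimal Python sketch of this generative process. The vocabulary size, number of topics, and hyperparameter values are illustrative placeholders, and the topic-word distributions (the role played by <math>\beta</math>) are simply drawn from a symmetric Dirichlet so that the sketch is runnable; they are not the values used anywhere in the paper.

<pre>
# Minimal sketch of the LDA generative process described above (illustrative values only).
import numpy as np

rng = np.random.default_rng(0)
V, T = 1000, 20                      # vocabulary size, number of topics (placeholders)
alpha, eta, xi = 0.1, 0.01, 50       # Dirichlet / Poisson hyperparameters (placeholders)

# Topic-word distributions: one row per topic.  This matrix plays the role of the
# corpus-level parameter beta; here it is drawn from a symmetric Dirichlet purely
# so that the sketch runs end to end.
beta = rng.dirichlet([eta] * V, size=T)          # T x V

def generate_document():
    N_d = rng.poisson(xi)                        # 1. number of words in the document
    theta_d = rng.dirichlet([alpha] * T)         # 2. per-document topic proportions
    words = []
    for _ in range(N_d):
        z = rng.choice(T, p=theta_d)             # 3a. topic for this word position
        w = rng.choice(V, p=beta[z])             # 3b. word drawn from p(w | z, beta)
        words.append(w)
    return words

print(len(generate_document()))                  # a document of roughly xi words
</pre>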
 
 
 
The parameters <math>\alpha</math> and <math>\beta</math> are corpus-level parameters, assumed to be sampled once in the process of generating a corpus. The variables <math>\theta_d</math> are document-level variables, sampled once per document. Finally, the variables <math>z_{d,n}</math> and <math>w_{d,n}</math> are word-level variables, sampled once for each word in each document.

=== JST model ===

JST adds a sentiment layer to this picture. A document is generated as follows:

  1. For each document <math>d</math>, choose a distribution <math>\pi_d</math> ~ Dirichlet(<math>\gamma</math>)
  2. For each sentiment label <math>l</math> under document <math>d</math>, choose a distribution <math>\theta_{d,l}</math> ~ Dirichlet(<math>\alpha</math>)
  3. For each word <math>w_i</math> in document <math>d</math>
     a. Choose a sentiment label <math>l_i</math> ~ <math>\pi_d</math>
     b. Choose a topic <math>z_i</math> ~ <math>\theta_{d,l_i}</math>
     c. Choose a word <math>w_i</math> from <math>\phi^{l_i}_{z_i}</math>, the distribution over words defined by the topic <math>z_i</math> and sentiment label <math>l_i</math>
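
A corresponding sketch of the JST process, under the same illustrative assumptions (arbitrary vocabulary size, <math>S = 3</math> sentiment labels, placeholder hyperparameters), differs from the LDA sketch only in the extra sentiment layer:

<pre>
# Minimal sketch of the JST generative process (illustrative values only).
import numpy as np

rng = np.random.default_rng(0)
V, T, S = 1000, 20, 3                 # vocabulary size, topics per sentiment label, sentiment labels
alpha, beta, gamma = 0.1, 0.01, 1.0   # placeholder hyperparameters

# One word distribution phi^{l}_{z} per (sentiment label, topic) pair.
phi = rng.dirichlet([beta] * V, size=(S, T))        # S x T x V

def generate_document(N_d=100):
    pi_d = rng.dirichlet([gamma] * S)               # 1. per-document sentiment distribution
    theta_d = rng.dirichlet([alpha] * T, size=S)    # 2. one topic distribution per sentiment label
    doc = []
    for _ in range(N_d):
        l = rng.choice(S, p=pi_d)                   # 3a. sentiment label l_i ~ pi_d
        z = rng.choice(T, p=theta_d[l])             # 3b. topic z_i ~ theta_{d, l_i}
        w = rng.choice(V, p=phi[l, z])              # 3c. word w_i ~ phi^{l_i}_{z_i}
        doc.append((w, l, z))
    return doc
</pre>

Setting <math>S = 1</math> collapses the sentiment layer and recovers an LDA-like process.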
 
  
 
=== Inference ===

The posterior distribution of the hidden variables given a document is intractable to compute exactly, so approximate inference is required.

For LDA, efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation were proposed. The basic idea is to use Jensen's inequality to obtain an adjustable lower bound on the log likelihood. A family of lower bounds, indexed by a set of variational parameters, is considered, and the variational parameters are chosen by an optimization procedure that attempts to find the tightest possible lower bound. This leads to the following iterative EM algorithm:

  1. E step: For each document, find the optimizing values of the variational parameters
  2. M step: Maximize the resulting lower bound on the log likelihood with respect to the model parameters <math>\alpha</math> and <math>\beta</math>

For the JST model, a Gibbs sampling algorithm is provided for estimating the posterior distribution of the latent variables given a document.
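
The sketch below shows how a collapsed Gibbs sampler for a JST-style model is typically structured: the sentiment label and topic of each word are resampled jointly from their conditional distribution given all other assignments, using count tables over word/sentiment/topic, document/sentiment/topic, and document/sentiment co-occurrences. It is an illustrative reconstruction, not the authors' implementation; the exact form of the conditional is derived in the paper, and all sizes and hyperparameters here are placeholders.

<pre>
# Sketch of a collapsed Gibbs sampler for a JST-style model (illustrative reconstruction,
# not the authors' implementation).  docs: list of documents, each a list of word ids < V.
import numpy as np

def gibbs_jst(docs, V, S=3, T=20, alpha=0.1, beta=0.01, gamma=1.0, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    D = len(docs)
    n_wlz = np.zeros((V, S, T))   # word / sentiment / topic counts
    n_lz = np.zeros((S, T))       # sentiment / topic totals
    n_dlz = np.zeros((D, S, T))   # document / sentiment / topic counts
    n_dl = np.zeros((D, S))       # document / sentiment counts
    n_d = np.array([len(doc) for doc in docs], dtype=float)

    # Random initialisation of (sentiment label, topic) assignments.
    assign = []
    for d, doc in enumerate(docs):
        cur = []
        for w in doc:
            l, z = rng.integers(S), rng.integers(T)
            cur.append((l, z))
            n_wlz[w, l, z] += 1; n_lz[l, z] += 1
            n_dlz[d, l, z] += 1; n_dl[d, l] += 1
        assign.append(cur)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                l, z = assign[d][i]
                # Remove the current assignment from all count tables.
                n_wlz[w, l, z] -= 1; n_lz[l, z] -= 1
                n_dlz[d, l, z] -= 1; n_dl[d, l] -= 1
                # Unnormalised conditional over all (sentiment, topic) pairs:
                # p(w | l, z) * p(z | l, d) * p(l | d), each smoothed by its prior.
                p = ((n_wlz[w] + beta) / (n_lz + V * beta)
                     * (n_dlz[d] + alpha) / (n_dl[d][:, None] + T * alpha)
                     * ((n_dl[d] + gamma) / (n_d[d] - 1 + S * gamma))[:, None])
                p = p.ravel() / p.sum()
                k = rng.choice(S * T, p=p)
                l, z = divmod(int(k), T)
                assign[d][i] = (l, z)
                # Add the new assignment back.
                n_wlz[w, l, z] += 1; n_lz[l, z] += 1
                n_dlz[d, l, z] += 1; n_dl[d, l] += 1
    return assign, n_wlz, n_dlz, n_dl
</pre>

Point estimates of <math>\pi</math>, <math>\theta</math>, and <math>\phi</math> can then be obtained by normalising the corresponding count tables after adding their Dirichlet hyperparameters.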
=== Tying-JST model ===

Under the JST model one has to choose a topic-document distribution <math>\theta_d</math> for every document, whereas in Tying-JST there is only one topic-document distribution <math>\theta</math>, which accounts for all the documents in the corpus.
  
 
=== Experiments ===

In the original LDA work, the model was evaluated empirically in several problem domains -- document modeling, document classification, and collaborative filtering. For the JST model, the authors used a corpus of preprocessed [[UsesDataset::Pang Movie Reviews|movie reviews]] to evaluate its performance.
  
 
== Study Plan ==
 
1. [http://en.wikipedia.org/wiki/Mixture_model Mixture models]
 
 
2. [http://www.cs.brown.edu/~th/papers/Hofmann-SIGIR99.pdf Probabilistic Latent Semantic Indexing]
 
 
3. [http://en.wikipedia.org/wiki/Variational_Bayesian_methods Variational Bayesian Methods]
 
 
4. [http://www.cs.princeton.edu/courses/archive/fall11/cos597C/lectures/variational-inference-i.pdf Variational Inference lecture pdf by Blei]
 
