Lin and He Joint Sentiment Topic Model for Sentiment Analysis

Citation

author = {Lin, Chenghua and He, Yulan},
title = {Joint sentiment/topic model for sentiment analysis},
booktitle = {Proceedings of the 18th ACM conference on Information and knowledge management},
series = {CIKM '09},
year = {2009},
isbn = {978-1-60558-512-3},
location = {Hong Kong, China},
pages = {375--384},
numpages = {10},
url = {http://doi.acm.org/10.1145/1645953.1646003},
doi = {10.1145/1645953.1646003},
acmid = {1646003},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {joint sentiment/topic model, latent dirichlet allocation, opinion mining, sentiment analysis}

Online Version

Joint Sentiment/Topic Model for Sentiment Analysis

Summary

This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA) that detects sentiment and topics simultaneously from text. Unlike other machine learning approaches to sentiment classification, which often require labeled corpora for classifier training, the proposed model is fully unsupervised.
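As a rough illustration of what modeling sentiment and topic jointly means, the sketch below simulates a JST-style generative process: each document has its own sentiment mixture, each sentiment label has its own topic mixture, and every word is drawn conditioned on a sampled (sentiment, topic) pair. The toy dimensions, hyperparameter values, and variable names are illustrative choices, not the paper's settings.

 # Illustrative simulation of a JST-style generative process (toy sizes).
 import numpy as np

 rng = np.random.default_rng(0)
 V, S, T = 1000, 2, 5                 # vocabulary size, sentiment labels, topics
 gamma, alpha, beta = 0.1, 0.1, 0.01  # Dirichlet hyperparameters (toy values)

 # Corpus-level word distributions: one per (sentiment, topic) pair
 phi = rng.dirichlet([beta] * V, size=(S, T))

 def generate_document(n_words=50):
     pi = rng.dirichlet([gamma] * S)               # document-level sentiment mixture
     theta = rng.dirichlet([alpha] * T, size=S)    # per-sentiment topic mixtures
     words = []
     for _ in range(n_words):
         l = rng.choice(S, p=pi)                   # sentiment label for this word
         z = rng.choice(T, p=theta[l])             # topic, conditioned on the sentiment
         words.append(rng.choice(V, p=phi[l, z]))  # word, conditioned on both
     return words

 doc = generate_document()

Inference in the paper runs in the opposite direction: given the observed words, it recovers the latent sentiment and topic assignments.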


LDA

LDA is a generative probabilistic model for collections of discrete data such as text corpora. It is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying (latent) set of topics, where each topic is characterized by a distribution over words. Each document is assumed to be generated using the following process:

 1. Choose the number of words N in the document by drawing N ~ Poisson(ξ)
 2. Choose the topic probabilities θ from a Dirichlet(α) distribution
 3. For each of the N words w_n:
   a. Choose a topic z_n from a Multinomial(θ) distribution
   b. Choose a word w_n from p(w_n | z_n, β), a multinomial distribution conditioned on the topic z_n

The parameters α and β are corpus-level parameters, assumed to be sampled once in the process of generating a corpus. The variables θ_d are document-level variables, sampled once per document. Finally, the variables z_dn and w_dn are word-level variables and are sampled once for each word in each document.
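As a concrete illustration, the generative process listed above can be simulated directly. The sketch below uses arbitrary toy dimensions and hyperparameter values; only the sampling structure is meant to match the description.

 # Toy simulation of the LDA generative process described above.
 import numpy as np

 rng = np.random.default_rng(1)
 V, K = 500, 10                   # vocabulary size, number of topics
 xi, alpha, eta = 40, 0.1, 0.01   # Poisson rate and Dirichlet hyperparameters (toy values)

 # Corpus-level topic-word distributions (beta in the text), one per topic
 beta = rng.dirichlet([eta] * V, size=K)

 def generate_document():
     N = rng.poisson(xi)                         # 1. number of words
     theta = rng.dirichlet([alpha] * K)          # 2. topic probabilities for this document
     words = []
     for _ in range(N):                          # 3. for each word position
         z = rng.choice(K, p=theta)              #    a. topic z_n ~ Multinomial(theta)
         words.append(rng.choice(V, p=beta[z]))  #    b. word w_n ~ p(w | z_n, beta)
     return words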

Inference

The posterior distribution of the hidden variables given a document is intractable. Efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation are provided.

The basic idea is to use Jensen’s inequality to obtain an adjustable lower bound on the log likelihood. A family of lower bounds, indexed by a set of variational parameters, is considered, and the variational parameters are chosen by an optimization procedure that attempts to find the tightest possible lower bound. This leads to the following iterative EM algorithm:

 1. E step: For each document, find the optimizing values of the variational parameters γ and φ
 2. M step: Maximize the resulting lower bound on the log likelihood with respect to the model parameters α and β (a sketch of this loop appears below)
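The following is a minimal sketch of that loop for LDA, under the simplifying assumption that α is held fixed and only β is re-estimated in the M step; the updates follow the standard mean-field equations from Blei et al., and all function and variable names are illustrative.

 # Simplified variational EM for LDA: alpha is kept fixed; only beta (the
 # topic-word distributions) is re-estimated in the M step.
 import numpy as np
 from scipy.special import digamma

 def e_step(doc, alpha, log_beta, n_iter=50):
     """Fit per-document variational parameters: gamma (Dirichlet over topics)
     and phi (per-word topic responsibilities)."""
     K = log_beta.shape[0]
     gamma = np.full(K, alpha + len(doc) / K)
     for _ in range(n_iter):
         log_phi = log_beta[:, doc].T + digamma(gamma)  # phi_nk prop. to beta_{k,w_n} * exp(digamma(gamma_k))
         log_phi -= log_phi.max(axis=1, keepdims=True)  # stabilize before exponentiating
         phi = np.exp(log_phi)
         phi /= phi.sum(axis=1, keepdims=True)
         gamma = alpha + phi.sum(axis=0)                # gamma_k = alpha + sum_n phi_nk
     return gamma, phi

 def variational_em(corpus, V, K, alpha=0.1, n_em_iter=20):
     """corpus: list of documents, each a sequence of word ids in [0, V)."""
     rng = np.random.default_rng(0)
     beta = rng.dirichlet([1.0] * V, size=K)            # random initial topics
     for _ in range(n_em_iter):
         counts = np.zeros((V, K))
         for doc in corpus:                             # E step, document by document
             _, phi = e_step(np.asarray(doc), alpha, np.log(beta))
             np.add.at(counts, np.asarray(doc), phi)    # accumulate expected topic-word counts
         beta = counts.T + 1e-12                        # M step: re-estimate beta
         beta /= beta.sum(axis=1, keepdims=True)
     return beta

In a full implementation one would also track the variational lower bound to test convergence and update α (e.g. by a Newton step), as done in the original LDA paper.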

Experiments

LDA is empirically evaluated in several problem domains -- document modeling, document classification, and collaborative filtering.

Study Plan

1. Mixture models

2. Probabilistic Latent Semantic Indexing

3. Variational Bayesian Methods

4. Variational Inference lecture pdf by Blei