Cross-Lingual Mixture Model for Sentiment Classification, Xinfan Meng, Furu Wei, Xiaohua Liu, Ming Zhou, Ge Xu, Houfeng Wang, ACL 2012

==Citation==
 
  
Cross-Lingual Mixture Model for Sentiment Classification, Xinfan Meng, Furu Wei, Xiaohua Liu, Ming Zhou, Ge Xu, Houfeng Wang, ACL 2012
  
==Online version==
  
An online PDF version is available [http://www.aclweb.org/anthology-new/P/P12/P12-1060.pdf here].
  
==Summary==
  
This paper proposes a cross-lingual mixture model (CLMM) to tackle the problem of cross-lingual sentiment classification. The motivation is the lack of labeled data in the target language, so the model brings in labeled data from the source language to help.
Given labeled source-language data Ds, a parallel corpus U, and optionally labeled target-language data Dt, the model maximizes the log-likelihood function of the parallel corpus.
  
[[File:P12-1060.png]]
  
That is, a word in the parallel corpus is generated in one of two ways: 1) a Chinese word is generated directly according to the polarity of the sentence, or 2) an English word with the same polarity and meaning is generated first and then translated into a Chinese word.
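Restating this generative story as a formula (the notation and the mixture weight below are my own shorthand for the equation shown in the figure above, not the paper's exact parameterization):

<math>P(w_t \mid c) = \lambda\, P_t(w_t \mid c) + (1-\lambda) \sum_{w_s} P_s(w_s \mid c)\, P(w_t \mid w_s)</math>

Here <math>P_t(w_t \mid c)</math> is the target-language (Chinese) word generation probability, <math>P_s(w_s \mid c)</math> is the source-language (English) one, and <math>P(w_t \mid w_s)</math> is the word projection probability.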
At the same time, the model maximizes the log-likelihood function of the source data (and, optionally, of the target data).
The word projection probabilities are given by the Berkeley aligner, and the word generation probabilities given a sentiment class are estimated with EM.
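To make the estimation concrete, here is a rough Python sketch of an EM loop of this kind, assuming the hidden variable is the sentiment class of each unlabeled parallel sentence pair; the function names, mixture weight, smoothing constant, and update details are my assumptions rather than the paper's implementation:

<pre>
import math
from collections import defaultdict

# Illustrative sketch only, not the paper's exact estimator: a simplified EM in
# which the hidden variable is the sentiment class of each unlabeled parallel
# sentence pair. The mixture weight `lam`, the smoothing constant `alpha`, and
# all names here are assumptions made for this sketch.

def em_word_generation(pairs, proj, p_src_given_c, classes=("pos", "neg"),
                       n_iter=20, lam=0.5, alpha=0.1):
    """
    pairs:         list of (target_words, source_words) token-list pairs from the parallel corpus
    proj:          dict (w_t, w_s) -> P(w_t | w_s), e.g. taken from the Berkeley aligner
    p_src_given_c: dict (w_s, c) -> P(w_s | c), estimated from the labeled source data
    Returns a dict (w_t, c) -> P(w_t | c) and the class prior.
    """
    tgt_vocab = {w for tgt, _ in pairs for w in tgt}
    # Uniform initialisation of the target-side word generation probabilities.
    p_tgt = {(w, c): 1.0 / len(tgt_vocab) for w in tgt_vocab for c in classes}
    prior = {c: 1.0 / len(classes) for c in classes}

    def word_prob(w_t, src_words, c):
        # Mixture of the two generation routes: generate w_t directly from c,
        # or generate an aligned source word from c and project it into w_t.
        projected = sum(p_src_given_c.get((w_s, c), 0.0) * proj.get((w_t, w_s), 0.0)
                        for w_s in src_words)
        return lam * p_tgt[(w_t, c)] + (1.0 - lam) * projected

    for _ in range(n_iter):
        counts = defaultdict(float)
        class_counts = defaultdict(float)
        for tgt, src in pairs:
            # E-step: posterior over the sentence-level sentiment class.
            log_post = {c: math.log(prior[c]) +
                           sum(math.log(word_prob(w, src, c) + 1e-12) for w in tgt)
                        for c in classes}
            m = max(log_post.values())
            post = {c: math.exp(log_post[c] - m) for c in classes}
            z = sum(post.values())
            for c in classes:
                post[c] /= z
                class_counts[c] += post[c]
                for w in tgt:
                    counts[(w, c)] += post[c]
        # M-step: re-estimate P(w_t | c) with add-alpha smoothing and update the prior.
        for c in classes:
            denom = sum(counts[(w, c)] for w in tgt_vocab) + alpha * len(tgt_vocab)
            for w in tgt_vocab:
                p_tgt[(w, c)] = (counts[(w, c)] + alpha) / denom
            prior[c] = class_counts[c] / len(pairs)
    return p_tgt, prior
</pre>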
Finally, the estimated word generation probabilities can be used in a Naive Bayes classifier.
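A minimal sketch of that final step, reusing the (assumed) output of the EM sketch above:

<pre>
import math

# Minimal sketch: a standard multinomial Naive Bayes decision rule over the
# words of a Chinese sentence, using the p_tgt and prior estimated above.
# The names and the smoothing value eps are assumptions for this sketch.

def classify(words, p_tgt, prior, classes=("pos", "neg"), eps=1e-12):
    scores = {c: math.log(prior[c]) +
                 sum(math.log(p_tgt.get((w, c), eps)) for w in words)
              for c in classes}
    return max(scores, key=scores.get)

# Toy usage: classify(["很", "好"], p_tgt, prior)
</pre>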
  
==Evaluation==
  
The authors evaluate CLMM's performance on [http://malt.ml.cmu.edu/mw/index.php/Dataset:MPQA MPQA] and [http://malt.ml.cmu.edu/mw/index.php/NTCIR-6_Opinion NTCIR] in two main settings:
1) No labeled data in the target language (Chinese) is available.
CLMM greatly improves performance (71%) compared with MT-SVM (52%-62%) and MT-Cotrain (59%-65%).
2) Labeled target-language (Chinese) data is used.
CLMM still beats the baseline SVM (trained on the labeled Chinese data) and is competitive with state-of-the-art methods such as Joint-Train (Lu et al., 2011) and MT-Cotrain (Wan, 2009), while requiring less training time.
  
==Discussion==
  
If we used MT results directly, we would suffer from two problems: 1) the covered vocabulary is quite limited, so many words in the target language cannot benefit from the source language; and 2) MT systems make mistakes, e.g. "too good to be true" can become positive after machine translation into Chinese, which leads to errors. The CLMM in this paper partly solves both problems. Still, it is essentially a way of finding sentiment words in the target language: the experiments clearly show that when labeled target-language data is available, the improvement given by this method is quite limited. It is not clear whether other methods for automatically expanding sentiment word lists could help a system in the same way CLMM does.
  
==Related papers==
  
 
Xiaojun Wan. 2009. Co-training for cross-lingual sentiment classification.
Bin Lu, Chenhao Tan, Claire Cardie, and Benjamin K. Tsou. 2011. Joint bilingual sentiment classification with unlabeled parallel corpora.
 
 
