Difference between revisions of "BinLu et al. ACL2011"

From Cohen Courses

Revision as of 21:04, 26 September 2012

Citation

Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora, Bin Lu, Chenhao Tan, Claire Cardie and Benjamin K. Tsou, ACL 2011

Online version

Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Summary

This paper addresses sentence-level sentiment analysis for multiple languages. The authors propose to leverage parallel corpora to learn a MaxEnt-based EM model that considers both languages simultaneously, under the assumption that the sentiment labels of parallel sentences should be similar.
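
As a rough illustration of the agreement assumption (not the paper's exact formulation), the sketch below combines the label distributions that two monolingual classifiers predict for a parallel sentence pair into one shared posterior. The function name `joint_posterior` and the product-and-renormalize rule are assumptions made for this example; in an EM-style procedure, a posterior of this kind could serve as a soft label for an unlabeled pair during the E-step.

```python
def joint_posterior(p_lang1, p_lang2):
    """Combine two per-language label distributions for one parallel pair.

    Assumption for illustration: both sentences share a single label, so the
    shared posterior is proportional to the product of the two predictions.
    """
    if len(p_lang1) != len(p_lang2):
        raise ValueError("both distributions must cover the same labels")
    products = [a * b for a, b in zip(p_lang1, p_lang2)]
    total = sum(products)
    if total == 0.0:
        raise ValueError("the classifiers assign zero mass to every shared label")
    return [p / total for p in products]


# Hypothetical predictions over (positive, negative) for a sentence
# and its translation in the other language.
posterior = joint_posterior([0.6, 0.4], [0.9, 0.1])
```

When the two classifiers agree, the combined posterior is more confident than either prediction alone; when they disagree, the product pulls it toward the more confident side.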

They experimented on two datasets: MPQA (Multi-Perspective Question Answering) and NTCIR-6 Opinion.

Evaluation

The paper compares its method with three kinds of state-of-the-art baseline algorithms.

 1. The first kind trains a separate classifier for each language. For this kind, the authors used MaxEnt, SVM and monolingual TSVM.
 2. The second kind is bilingual TSVM.
 3. The third kind is the semi-supervised learning strategy co-training.
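
To make the co-training baseline more concrete, here is a minimal sketch of its core selection step in the general sense of co-training: one classifier picks its most confident predictions on unlabeled examples, and those pseudo-labels are then added to the other classifier's training data. It assumes that the two languages of a parallel pair play the role of the two views; the function name `select_confident` and the parameter `k` are hypothetical and not taken from the paper.

```python
def select_confident(probabilities, k):
    """Return (index, label) pairs for the k most confident predictions.

    probabilities: one label distribution per unlabeled example, as produced
    by the classifier of one view (here: one language).
    """
    scored = []
    for index, dist in enumerate(probabilities):
        label = max(range(len(dist)), key=lambda j: dist[j])
        scored.append((dist[label], index, label))
    # Highest confidence first; ties are broken by the lower example index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(index, label) for _, index, label in scored[:k]]


# Hypothetical predictions of the first-language classifier on three
# unlabeled parallel pairs, over the labels (0 = positive, 1 = negative).
probs_lang1 = [[0.55, 0.45], [0.05, 0.95], [0.80, 0.20]]
pseudo_labels = select_confident(probs_lang1, k=2)
```

In a full co-training loop, the selected pairs would be added to the training set of the second-language classifier, the roles would then be swapped, and both classifiers would be retrained.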

Discussion

This paper poses two important social problems related to bipartite social graphs and explains how those problems can be solved efficiently using random walks.

They also claim that the neighborhoods over nodes can represent personalized clusters depending on different perspectives.

During the presentation, one audience member asked whether anomaly detection in this paper is similar to the betweenness of edges defined in Kleinberg's text, as discussed in the Class Meeting for 10-802 on 01/26/2010. I think they are similar. The textbook proposes detecting edges with high betweenness and using them to partition the graph. This paper first creates neighbourhood partitions based on random walk probabilities, which as a by-product yields the nodes and edges with high betweenness values.


Related papers

For multilingual sentiment analysis, there are several related works:

  • Paper:Learning multilingual subjective language via cross-lingual projections:[1]
  • Paper:Multilingual subjectivity: Are more languages better?:[2]
  • Paper:Cross-language text classification using structural correspondence learning.:[3]

For semi-supervised learning, related papers include:

  • Paper:Combining labeled and unlabeled data with co-training:[4]
  • Paper:Text classification from labeled and unlabeled documents using EM.:[5]

Study plan