== Big Idea ==

The motivation of the two papers is the same: there is a large amount of resources in English, and both papers want to use those resources in a target language where far fewer resources are available.

== Method ==
 
The main difference between the two papers is how they use the parallel corpus, even though both rely on one. 1) Rada Mihalcea uses it to project each word in the source language directly onto the target language, and uses the information (annotation) attached to the source-language word to label the aligned target-language word. 2) Bin Lu instead trains a MaxEnt-based model that maximizes the joint probability, considering both languages at the same time, which in my opinion is the more promising approach.
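
As a rough illustration of 1), the projection idea can be sketched in a few lines of Python. This is only a hypothetical sketch, assuming word alignments are available as (source index, target index) pairs; the names and data structures are mine, not code from either paper.

<pre>
# Minimal sketch of annotation projection across a word alignment.
# All names here are hypothetical; this is not code from either paper.

def project_labels(alignment, src_labels):
    """Copy labels from source-word indices to target-word indices.

    alignment  -- list of (src_index, tgt_index) pairs from a word aligner
    src_labels -- dict mapping src_index -> label (e.g. 'subjective')
    """
    tgt_labels = {}
    for s, t in alignment:
        if s in src_labels:
            tgt_labels[t] = src_labels[s]  # the annotation follows the alignment link
    return tgt_labels

# Toy usage: the source word at index 1 is annotated, and the label is projected.
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_labels(alignment, {1: "subjective"}))  # -> {1: 'subjective'}
</pre>

Bin Lu's approach in 2) does not copy labels this way; it instead learns classifiers for both languages at once by maximizing a joint objective over the aligned data.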
 
== Problem ==
 
The problems they deal with are quite similar. Rada Mihalcea's work is more focused on subjectivity analysis, while Bin Lu's work is more focused on sentence-level sentiment analysis, but the underlying problem is the same: both want to draw lexicons (mainly) from the source language and label the target-language lexicon with the information needed for the specific task.
 
== Other ==
 
Although both achieve good results in their experiments, Rada Mihalcea and Bin Lu suffer from the same problem: whether using dictionary-based methods, as in Rada Mihalcea's first approach, or using the parallel corpus (used for alignment in Rada's work, with Google translation), the resulting lexicon is actually quite limited. One can obtain a substantial number of useful lexicon entries, but many entries that are very important for the specific task are ignored, simply because they never appear in the automatically translated text or in the standard dictionary. This can be even more harmful than the ambiguity problem.
  
 
== Additional Questions ==
 
How much time did you spend reading the (new, non-wikified) paper you summarized? 3 hours
  
How much time did you spend reading the old wikified paper? 2 hours
  
 
How much time did you spend reading the summary of the old paper? 10 minutes
 
How much time did you spend reading background material? Half an hour
 
Was there a study plan for the old paper? Yes
 
If so, did you read any of the items suggested by the study plan? and how much time did you spend with reading them? No.
  
 
Give us any additional feedback you might have about this assignment.
 