Comparison of Birke07 and Birke06


Latest revision as of 11:01, 8 November 2012

Papers

  1. Active learning for the identification of nonliteral language (Birke07)
  2. A Clustering Approach for the Nearly Unsupervised Recognition of Nonliteral Language (Birke06)

Big Idea

  • (Birke 07) is a further development of (Birke 06). (Birke 06) uses a literal feedback seed set and a non-literal feedback seed set to perform word sense disambiguation of the verb in a given sentence, and thereby distinguishes non-literal uses from literal ones. However, the results in (Birke 06) are apparently so noisy that the authors adopted heuristic rules and a voting mechanism to clean the data. (Birke 07) uses active learning instead of heuristic rules or voting, so that human annotation improves the result more effectively rather than only "cleaning" it.
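
The idea of replacing cleaning heuristics with active learning can be sketched with a toy example. This is only an illustration under stated assumptions, not the algorithm from either paper: a simple word-overlap (Jaccard) score stands in for the papers' similarity computation, the human annotator is simulated by a function, and all names (`attraction`, `classify`, `active_learning`, `oracle`) are hypothetical.

```python
def jaccard(a, b):
    """Word-overlap similarity between two token lists (an assumed stand-in)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def attraction(sentence, feedback_set):
    """Highest similarity between a sentence and any sentence in a feedback set."""
    return max((jaccard(sentence, s) for s in feedback_set), default=0.0)

def classify(sentence, literal_set, nonliteral_set):
    """Return (label, margin); a small margin means the decision is uncertain."""
    lit = attraction(sentence, literal_set)
    non = attraction(sentence, nonliteral_set)
    label = "literal" if lit >= non else "nonliteral"
    return label, abs(lit - non)

def active_learning(pool, literal_set, nonliteral_set, oracle, budget):
    """Ask the oracle about the `budget` least certain sentences, then label the rest."""
    literal_set, nonliteral_set = list(literal_set), list(nonliteral_set)
    pool = list(pool)
    for _ in range(min(budget, len(pool))):
        # Uncertainty sampling: query the sentence with the smallest margin.
        query = min(pool, key=lambda s: classify(s, literal_set, nonliteral_set)[1])
        pool.remove(query)
        # The human answer enlarges the matching feedback set.
        target = literal_set if oracle(query) == "literal" else nonliteral_set
        target.append(query)
    labels = {" ".join(s): classify(s, literal_set, nonliteral_set)[0] for s in pool}
    return labels, literal_set, nonliteral_set

# Toy data for the verb "grasp" (made up for illustration).
literal_seed = [["he", "grasped", "the", "rope"]]
nonliteral_seed = [["she", "grasped", "the", "idea"]]
pool = [
    ["they", "grasped", "the", "concept"],
    ["he", "grasped", "the", "rope", "tightly"],
    ["she", "grasped", "the", "idea", "quickly"],
]

def oracle(sentence):
    """Simulated annotator (assumption): abstract objects mark non-literal use."""
    return "nonliteral" if {"concept", "idea"} & set(sentence) else "literal"

labels, literal_out, nonliteral_out = active_learning(
    pool, literal_seed, nonliteral_seed, oracle, budget=1
)
print(labels)
```

With a budget of one query, the sketch asks the simulated annotator about the sentence that is equally close to both seed sets, which is exactly the kind of case that automatic cleaning cannot resolve on its own.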

Comparison

  • (Birke 06), which clusters without active learning, obtains an f-score of 53.8%; (Birke 07), which adds active learning, obtains 64.9%.
  • (Birke 07) and (Birke 06) address almost the same task, with the same data set and the same evaluation metrics and method. My only concern is that the active learning in (Birke 07) involves much more human-annotated gold-standard data than the method of (Birke 06), so I'm not sure this comparison is really fair.
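
To put the two reported f-scores side by side, the gap can be expressed in absolute points and as a relative gain. The snippet below uses only the two numbers quoted above; the helper name `improvement` is an assumption made for illustration.

```python
def improvement(old, new):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = new - old
    relative = 100.0 * absolute / old
    return absolute, relative

# F-scores quoted above: (Birke 06) 53.8%, (Birke 07) 64.9%.
abs_gain, rel_gain = improvement(53.8, 64.9)
print(f"{abs_gain:.1f} points, {rel_gain:.1f}% relative")
# prints "11.1 points, 20.6% relative"
```

This says nothing about how much extra annotation was needed to obtain the gain, which is the fairness concern raised above.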

Other Questions

  • How much time did you spend reading the (new, non-wikified) paper you summarized?
    • 4 hours.
  • How much time did you spend reading the old wikified paper?
    • 30 mins.
  • How much time did you spend reading the summary of the old paper?
    • 1.5 hours.
  • How much time did you spend reading background material?
    • The problem is very relevant to my own research project, so not much, only about 1 hour.
  • Was there a study plan for the old paper?
    • Yes
      • If so, did you read any of the items suggested by the study plan, and how much time did you spend reading them?
        • Not really for the same paper, but I took a look at some other word sense disambiguation papers. In total, maybe about 1.5 hours.
  • Give us any additional feedback you might have about this assignment.
    • This is a good assignment; it lets us really understand what those papers are about.