Difference between revisions of "Compare Ku Akcora"

From Cohen Courses

Latest revision as of 14:45, 2 November 2012

Two Papers

1 http://malt.ml.cmu.edu/mw/index.php/Akcora_et_al,_SOMA_2010

2 http://malt.ml.cmu.edu/mw/index.php/L._Ku,_Y._Liang,_and_H._Chen._Opinion_extraction,_summarization_and_tracking_in_news_and_blog_corpora._In_Proceedings_of_AAAI-2006#Opinion_Tracking

Problem

Both papers address the problem of detecting opinion in text data, and both cover opinion tracking over a time period. The first paper focuses on identifying the breakpoints at which opinion changes, while the second paper is more about summarizing opinion into polarities, that is, whether it is negative or positive.

Algorithm

The algorithms used in the two papers are entirely different, since their goals differ. The first paper uses two types of measures to detect changes in opinion on a topic: a vector space model and a set space model. The second paper uses a weighting scheme to determine the polarity of words in Chinese, which is considerably more complicated than in languages such as English because Chinese words are composed of characters that can each carry a different meaning on their own. It is worth mentioning, however, that both papers use the popular TF-IDF scheme to detect key words. In the first paper, TF-IDF is modified to accumulate over time. In the second paper, TF-IDF is the basic component for identifying whether a sentence contains key words related to the topic, which in turn decides whether the sentence's sentiment is added to the accumulated sentiment of the document.
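The ideas above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the vector space measure is cosine similarity over term counts, that the set space measure is Jaccard similarity over word sets, and that a word's polarity is the plain average of per-character scores. The function names, the threshold of 0.5, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Vector space view (assumed): cosine over term-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_similarity(a, b):
    """Set space view (assumed): overlap of the distinct words used."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def find_breakpoints(windows, similarity, threshold=0.5):
    """Indices i where window i is dissimilar to window i - 1.

    The threshold is a hypothetical parameter chosen for this toy example.
    """
    return [
        i
        for i in range(1, len(windows))
        if similarity(windows[i - 1], windows[i]) < threshold
    ]


def word_polarity(word, char_scores):
    """Assumed scheme: average the scores of a word's characters.

    char_scores maps a character to a score in [-1, 1]; characters
    without a score count as neutral (0.0).
    """
    scores = [char_scores.get(ch, 0.0) for ch in word]
    return sum(scores) / len(scores) if scores else 0.0


# Toy time windows of opinion words (invented data, one list per period).
windows = [
    ["great", "win", "great"],
    ["great", "win", "fan"],
    ["crash", "scandal", "sad"],
]

print(find_breakpoints(windows, cosine_similarity))
print(find_breakpoints(windows, jaccard_similarity))
```

In this toy data the third window shares no words with the second, so both measures flag it as a breakpoint. On real data the two measures can disagree, because cosine similarity is sensitive to how often a word is repeated while Jaccard similarity only sees whether it occurs.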

Data set

The first paper used tweets collected about Tiger Woods's car accident of November 27, 2009. The second paper, in its opinion tracking part, used the NTCIR corpus covering the 2000 Taiwan presidential election. Both data sets show very clear differences in sentiment.


Comments

Both papers try to extract opinion from text data. The second one restricts opinion to binary options (positive and negative), while the first one allows multiple dimensions of opinion. My personal concern is that it is hard to evaluate the summarization or the opinion breakpoints obtained in either paper, since all of these concepts are relatively subjective.


Six Questions

How much time did you spend reading the (new, non-wikified) paper you summarized? 4 hours

How much time did you spend reading the old wikified paper? 3 hours (it is quite short)

How much time did you spend reading the summary of the old paper? 30 min

How much time did you spend reading background material? 15 min, because I had already gone through the background material while preparing our project.

Was there a study plan for the old paper? No

Give us any additional feedback you might have about this assignment. I selected this pair because both papers deal with change over time, which is quite interesting and related to my own project. Later, however, I found that they are rather dated: there are newer papers on this topic with more comprehensive analysis and better experimental settings. For example, this one: http://www.bradblock.com/Topics_over_Time_A_Non_Markov_Continuous_Time_Model_of_Topical_Trends.pdf