Difference between revisions of "Comparison Das et al WSDM 2011 and Zhao et al AAAI 2007"

From Cohen Courses

Revision as of 22:19, 5 November 2012

This is a comparison of two related papers in event detection and temporal information extraction.

Papers

The papers are

Comparative analysis of both papers

At a high level, both papers aim to discover events from large amounts of temporal information. Both leverage user-generated content: Das et al. use Wikipedia as their dataset, while Zhao et al. use the Enron email corpus and Dailykos blogs.

Evaluation

Zhao et al. evaluated on the Enron email corpus and Dailykos blogs [3]. Thirty events were manually labeled as ground truth in the dataset by looking for correspondence with real-world news.

Performance is measured by the precision, recall, and F-score with which their model recovers the labeled events.
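
As an illustration of these metrics, the following is a minimal sketch of computing precision, recall, and F-score for a set of detected events against a labeled ground truth. It assumes that each event can be reduced to a hashable identifier and that a detected event counts as correct only on an exact match; the papers' actual matching criteria may differ, and the function name and sample identifiers are hypothetical.

```python
def precision_recall_f1(detected, ground_truth):
    """Return (precision, recall, F1) for detected events vs. labeled events.

    Assumption: events are compared by exact identifier match, which is a
    simplification and not necessarily the matching rule used in the papers.
    """
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 4 labeled events, 3 detected, 2 of them correct.
truth = {"e1", "e2", "e3", "e4"}
found = {"e1", "e2", "e5"}
p, r, f = precision_recall_f1(found, truth)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here precision is the fraction of detected events that are in the ground truth, recall is the fraction of ground-truth events that were detected, and the F-score is their harmonic mean.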

Discussion

They found that taking temporal and social dimensions into account increases the F-score significantly. Integrating these diverse features in a step-wise manner also performed better than simply including all of them in a standard machine learning framework.

Related papers

There has been a lot of work on event detection.

Study plan

  • Article: Adaptive time series model [4]
  • Graph-cut-based clustering [5]