Difference between revisions of "Yang et al, SIGIR 98"

Line 1: Line 1:
This [[Category::Paper]] is relevant to our project on detecting controversial events in Twitter.
+
= A study on retrospective and online event detection =
 +
 
 +
This [[Category::Paper]] is relevant to [[AddressesProblem::Controversial_events_detection|detecting controversial events]].
  
 
== Citation ==
 
== Citation ==
Line 11: Line 13:
 
== Summary ==
 
== Summary ==
  
This paper addresses the problems of detecting events in news stories.  
+
This paper addresses the problems of [[AddressesProblem::Event_detection|detecting events]] in news stories.  
 
They present solutions for retrospective event detection and online event detection using [[UsesMethod::clustering]] techniques: [[UsesMethod::group average clustering]] and [[UsesMethod::single pass clustering]].
 
They present solutions for retrospective event detection and online event detection using [[UsesMethod::clustering]] techniques: [[UsesMethod::group average clustering]] and [[UsesMethod::single pass clustering]].
 
They addressed the problem of the streaming nature of their data by doing incremental IDF, where the IDF values of terms in the corpus is incrementally updated as a new document is observed.
 
They addressed the problem of the streaming nature of their data by doing incremental IDF, where the IDF values of terms in the corpus is incrementally updated as a new document is observed.
Line 17: Line 19:
 
They also tried reweighting similarity scores according to the temporal proximity of two documents.
 
They also tried reweighting similarity scores according to the temporal proximity of two documents.
  
They experimented with the Topic Detection and Tracking corpus.
+
They experimented with the [[UsesDataset::Topic Detection and Tracking]] corpus.
  
 
== Evaluation ==
 
== Evaluation ==
Line 32: Line 34:
 
== Related papers ==
 
== Related papers ==
 
There has been a lot of work on event detection.
 
There has been a lot of work on event detection.
* [[RelatedPaper::Lin_et_al_KDD_2011]] This paper address a method to observe and track the popular events or topics that evolve over time in the communities.
+
* [[RelatedPaper::Lin_et_al_KDD_2011|A Statistical Model for Popular Events Tracking in Social Communities. Lin et al, KDD 2011]] This paper address a method to observe and track the popular events or topics that evolve over time in the communities.
* [[UsesMethod::Popular_Event_Tracking]] A method that take both interest and network structure into account.
+
* [[RelatedPaper::Popescu and Pennacchiotti, CIKM 10|Detecting controversial events from Twitter. Popescu and Pennacchiotti, CIKM 10]] This paper addresses the task of identifying controversial events using Twitter as a starting point.
* [[RelatedPaper::Automatic_Detection_and_Classification_of_Social_Events]] This paper aims at detecting and classifying social events using Tree kernels.
+
* [[RelatedPaper::Zhao_et_al,_AAAI_07|Temporal and information flow based event detection from social text streams. Zhao et al, AAAI 07]] The authors proposes a method for detecting events from social text stream by exploiting more than just the textual content, but also exploring the temporal and social dimensions of their data.
 +
* [[UsesMethod::Popular_Event_Tracking|Popular Event Tracking]] A method that take both interest and network structure into account.
 +
* [[RelatedPaper::Automatic_Detection_and_Classification_of_Social_Events|Automatic Detection and Classification of Social Events]] This paper aims at detecting and classifying social events using Tree kernels.
  
 
== Study plan ==
 
== Study plan ==
 
* Article: Group average agglomerative clustering [http://nlp.stanford.edu/IR-book/html/htmledition/group-average-agglomerative-clustering-1.html]
 
* Article: Group average agglomerative clustering [http://nlp.stanford.edu/IR-book/html/htmledition/group-average-agglomerative-clustering-1.html]
 
* Article: Single pass clustering [http://orion.lcg.ufrj.br/Dr.Dobbs/books/book5/chap16.htm]
 
* Article: Single pass clustering [http://orion.lcg.ufrj.br/Dr.Dobbs/books/book5/chap16.htm]

Latest revision as of 00:01, 1 October 2012

A study on retrospective and online event detection

This Paper is relevant to detecting controversial events.

Citation

Yiming Yang, Thomas Pierce, and Jaime Carbonell. A study on retrospective and online event detection. In Proc. ACM SIGIR, pages 28–36, Melbourne, 1998.

Online version

A study on retrospective and online event detection

Summary

This paper addresses the problem of detecting events in news stories. The authors present solutions for retrospective and online event detection using two clustering techniques: group average clustering and single pass clustering. To handle the streaming nature of the data, they use incremental IDF, in which the IDF values of terms are updated as each new document is observed. They also use a time window that limits the search for similar news events to the last m received stories, and they tried reweighting similarity scores according to the temporal proximity of two documents.
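The online setting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the smoothed IDF formula, the linear decay weight, the default threshold, and the names `IncrementalIdf`, `tfidf_vector`, and `detect_new_events` are all hypothetical, and each story is compared against individual past stories rather than cluster centroids to keep the example short.

```python
import math
from collections import Counter, deque


class IncrementalIdf:
    """Document-frequency table that is updated as each story arrives."""

    def __init__(self):
        self.num_docs = 0
        self.doc_freq = Counter()

    def update(self, tokens):
        self.num_docs += 1
        self.doc_freq.update(set(tokens))  # count each term once per story

    def idf(self, term):
        # Smoothed IDF; the exact formula is an assumption, not the paper's.
        return math.log((1 + self.num_docs) / (1 + self.doc_freq[term])) + 1.0


def tfidf_vector(tokens, idf_table):
    """Unit-length TF-IDF vector built with the IDF values known so far."""
    counts = Counter(tokens)
    vec = {t: c * idf_table.idf(t) for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec


def cosine(u, v):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def detect_new_events(stories, window_size=3, threshold=0.3):
    """Return one flag per story: True if it looks like a new event."""
    idf_table = IncrementalIdf()
    window = deque(maxlen=window_size)  # the last m stories only
    flags = []
    for tokens in stories:
        idf_table.update(tokens)
        vec = tfidf_vector(tokens, idf_table)
        best = 0.0
        # rank 0 is the most recent story; older stories are down-weighted.
        for rank, past in enumerate(reversed(window)):
            decay = 1.0 - rank / window_size  # assumed linear time decay
            best = max(best, decay * cosine(vec, past))
        flags.append(best < threshold)
        window.append(vec)  # past vectors keep the IDF values of their time
    return flags
```

For example, `detect_new_events([["quake", "japan", "tokyo"], ["quake", "japan", "damage"], ["election", "vote", "senate"]])` flags the first and third stories as new events and the second as a follow-up.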

They experimented with the Topic Detection and Tracking corpus.

Evaluation

They evaluated the ability of their systems to recover news events retrospectively and in an online setting. They also compared their system's output with human judgements for two specific events to analyse the behaviour of their algorithm.

Discussion

This paper presents a bag-of-words clustering approach to detecting new events in a news corpus. The authors show that online detection is a more difficult problem than retrospective detection.

Related papers

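There has been a lot of work on event detection.

  • A Statistical Model for Popular Events Tracking in Social Communities. Lin et al, KDD 2011: This paper presents a method to observe and track popular events or topics that evolve over time in communities.
  • Detecting controversial events from Twitter. Popescu and Pennacchiotti, CIKM 10: This paper addresses the task of identifying controversial events using Twitter as a starting point.
  • Temporal and information flow based event detection from social text streams. Zhao et al, AAAI 07: The authors propose a method for detecting events from social text streams that exploits not only the textual content but also the temporal and social dimensions of the data.
  • Popular Event Tracking: A method that takes both interest and network structure into account.
  • Automatic Detection and Classification of Social Events: This paper aims at detecting and classifying social events using tree kernels.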

Study plan

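  • Article: Group average agglomerative clustering [http://nlp.stanford.edu/IR-book/html/htmledition/group-average-agglomerative-clustering-1.html]
  • Article: Single pass clustering [http://orion.lcg.ufrj.br/Dr.Dobbs/books/book5/chap16.htm]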
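
As a companion to the first reading, here is a minimal sketch of group-average agglomerative clustering. It is a naive cubic-time version for illustration only, not the procedure used in the paper. It assumes dense, non-zero input vectors, and the function names and stopping threshold are hypothetical. The similarity of two clusters is the average dot product over all pairs of distinct unit vectors in their union, computed from the cluster vector sums.

```python
import math


def normalize(vec):
    """Scale a dense vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def group_average_sim(sum_a, size_a, sum_b, size_b):
    """Average pairwise dot product in the merged cluster, self-pairs excluded."""
    n = size_a + size_b
    merged = [x + y for x, y in zip(sum_a, sum_b)]
    total = sum(x * x for x in merged)  # includes n self-similarities of 1.0
    return (total - n) / (n * (n - 1))


def gaac(vectors, threshold=0.5):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    # Each cluster is (member indices, sum of its unit vectors).
    clusters = [([i], normalize(v)) for i, v in enumerate(vectors)]
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = group_average_sim(
                    clusters[a][1], len(clusters[a][0]),
                    clusters[b][1], len(clusters[b][0]),
                )
                if sim > best_sim:
                    best_sim, best_pair = sim, (a, b)
        if best_sim < threshold:  # assumed stopping rule
            break
        a, b = best_pair
        members = clusters[a][0] + clusters[b][0]
        summed = [x + y for x, y in zip(clusters[a][1], clusters[b][1])]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((members, summed))
    return sorted(sorted(members) for members, _ in clusters)
```

Calling `gaac([[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9]])` groups the first two vectors together and the last two together, then stops because the group-average similarity of the two resulting clusters falls below the threshold.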