Difference between revisions of "Guralnik 99"
m |
|||
Line 24: | Line 24: | ||
== Summary == | == Summary == | ||
− | === | + | === Task Definition=== |
+ | *Develop a general approach to change-point detection that generalize across wide range of application | ||
− | === | + | === Method === |
− | + | ==== Batch Algorithm ==== | |
− | + | *Algorithm overview | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | === | ||
− | |||
− | + | The algorithm takes the set of approximating basis functions MSet | |
+ | and the time series T | ||
− | + | # new-change-point = find-candidate(T, MSet) | |
+ | # Change-Points = <math>\phi</math> | ||
+ | # Candidates = <math>\phi</math> | ||
+ | # Tl, Tz = get-new-time-ranges(T, Change-Points, new-change-point) | ||
+ | # '''while'''(stopping criteria is not met) do begin | ||
+ | ## cl = find-candidate(T1, MSet) | ||
+ | ## c2 = find-andidate(T2, MSet) | ||
+ | ## Candidates = Candidates <math>\cup c_1</math> | ||
+ | ##Candidates = Candidates <math>\cup c_2</math> | ||
+ | ##new-change-point = c <math>\in</math> Candidates |Q(Change-Points,c) = min | ||
+ | ##Candidates = Candidates \ new-change-point | ||
+ | ##Tl,T2 = get-new-time-ranges(T, Change-Points, new-change-point) | ||
+ | ##Change-Points = Change-Points <math>\cup</math> new-change-points | ||
+ | #'''end''' | ||
− | + | *Stopping Criteria | |
− | + | If in iterations k and k+1 the respective likelihood criteria, the algorithm should stop if the difference of the likelihood proportioned to the last step likelihood is below a small constant. | |
− | + | *Experiment with Traffic Data | |
+ | The data used in our experiments was taken from | ||
+ | highway traffic sensors, called loop detectors, in the Minneapolis-St. Paul metro area. A loop detector is a | ||
+ | sensor, embedded in the road, with an electro-magnetic | ||
+ | field around it, which is broken when a vehicle goes | ||
+ | over it. Each such breaking, and the subsequent reestablishment, | ||
+ | of the field is recorded as a count. Traffic | ||
+ | volume is defined as the vehicle count per unit time. We need to find the change point detection | ||
+ | algorithm performed compared to a person doing | ||
+ | the same task through visual inspection. | ||
+ | * The results seems a little week. It has no ground truth. Statistically speaking it does performs better than 4 human subject. | ||
− | + | ==== Incremental Algorithm ==== | |
+ | Because the next data point collected by the sensor reflects significant change in phenomenon, then its likelihood criteria of being a change point is going to b smaller than likelihood criteria that it is not. However, if the difference in likelihood is small, it can just be noise. So the incremental algorithm change the stopping criteria to | ||
+ | * The no_change period and change period likelihood difference is below a small constant times of the no_change period likelihood. | ||
+ | |||
+ | =====Performance Evaluation===== | ||
+ | Not as good as the batch model, because it is local optimum since the future coming data is not observed | ||
+ | |||
+ | == Interesting points == | ||
+ | It is a non-Bayesian model, hence prior model doesn't require. It is very easy to implement and it considers the time span which is very important, but since it is a quite old paper, the experiments are obsolete. | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
== Related Papers == | == Related Papers == |
Revision as of 22:07, 1 October 2012
Contents
Citation
@inproceedings{:conf/kdd/GuralnikS99,
author = {Valery Guralnik and Jaideep Srivastava}, title = {Event Detection from Time Series Data}, booktitle = {KDD}, year = {1999}, pages = {33-42}, ee = {http://doi.acm.org/10.1145/312129.312190}, bibsource = {http://dblp.uni-trier.de}
}
Abstract from the paper
Online version
Summary
Task Definition
- Develop a general approach to change-point detection that generalize across wide range of application
Method
Batch Algorithm
- Algorithm overview
The algorithm takes the set of approximating basis functions MSet and the time series T
- new-change-point = find-candidate(T, MSet)
- Change-Points =
- Candidates =
- Tl, Tz = get-new-time-ranges(T, Change-Points, new-change-point)
- while(stopping criteria is not met) do begin
- cl = find-candidate(T1, MSet)
- c2 = find-andidate(T2, MSet)
- Candidates = Candidates
- Candidates = Candidates
- new-change-point = c Candidates |Q(Change-Points,c) = min
- Candidates = Candidates \ new-change-point
- Tl,T2 = get-new-time-ranges(T, Change-Points, new-change-point)
- Change-Points = Change-Points new-change-points
- end
- Stopping Criteria
If in iterations k and k+1 the respective likelihood criteria, the algorithm should stop if the difference of the likelihood proportioned to the last step likelihood is below a small constant.
- Experiment with Traffic Data
The data used in our experiments was taken from highway traffic sensors, called loop detectors, in the Minneapolis-St. Paul metro area. A loop detector is a sensor, embedded in the road, with an electro-magnetic field around it, which is broken when a vehicle goes over it. Each such breaking, and the subsequent reestablishment, of the field is recorded as a count. Traffic volume is defined as the vehicle count per unit time. We need to find the change point detection algorithm performed compared to a person doing the same task through visual inspection.
- The results seems a little week. It has no ground truth. Statistically speaking it does performs better than 4 human subject.
Incremental Algorithm
Because the next data point collected by the sensor reflects significant change in phenomenon, then its likelihood criteria of being a change point is going to b smaller than likelihood criteria that it is not. However, if the difference in likelihood is small, it can just be noise. So the incremental algorithm change the stopping criteria to
- The no_change period and change period likelihood difference is below a small constant times of the no_change period likelihood.
Performance Evaluation
Not as good as the batch model, because it is local optimum since the future coming data is not observed
Interesting points
It is a non-Bayesian model, hence prior model doesn't require. It is very easy to implement and it considers the time span which is very important, but since it is a quite old paper, the experiments are obsolete.
Related Papers
- T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors.
In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 851–860, New York, NY, USA, April 2010. ACM
- J. Sankaranarayanan, H. Samet, B. E. Teitler, M. D.Lieberman, and J. Sperling. TwitterStand: news in tweets. In GIS ’09: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 42–51, New York, NY, USA, November 2009. ACM Press.
- A Statistical Model for Popular Events Tracking in Social Communities. Lin et al, KDD 2011 This paper address a method to observe and track the popular events or topics that evolve over time in the communities.
- A study on retrospective and online event detection. Yang et al, SIGIR 98 This paper addresses the problems of detecting events in news stories.
- Temporal and information flow based event detection from social text streams. Zhao et al, AAAI 07 This paper addresses the problems of detecting events in news stories.
- Automatic Detection and Classification of Social Events. Agarwal and Rambow, ACL 10 This paper aims at detecting and classifying social events using Tree kernels.
- Detecting controversial events from Twitter. Popescu and Pennacchiotti, CIKM 10 This paper addresses the task of identifying controversial events using Twitter as a starting point.
- Information credibility on twitter. Castillo et al, WWW 11 The authors develop a general approach to change-point detection that generalize across wide range of application.