Guralnik 99

From Cohen Courses
Revision as of 23:07, 1 October 2012 by Zhouyu (talk | contribs)
Jump to navigationJump to search

Castillo http://delivery.acm.org/10.1145/320000/312190/p33-guralnik.pdf?ip=128.237.122.250&acc=ACTIVE%20SERVICE&CFID=119212228&CFTOKEN=52277574&__acm__=1348531826_377333b00daa1db4fd36cb60f6bb28fb


Citation

@inproceedings{:conf/kdd/GuralnikS99,

 author    = {Valery Guralnik and
              Jaideep Srivastava},
 title     = {Event Detection from Time Series Data},
 booktitle = {KDD},
 year      = {1999},
 pages     = {33-42},
 ee        = {http://doi.acm.org/10.1145/312129.312190},
 bibsource = {http://dblp.uni-trier.de}

}


Abstract from the paper

Online version

pdf link to the paper

Summary

Task Definition

  • Develop a general approach to change-point detection that generalize across wide range of application

Method

Batch Algorithm

  • Algorithm overview

The algorithm takes the set of approximating basis functions MSet and the time series T

  1. new-change-point = find-candidate(T, MSet)
  2. Change-Points =
  3. Candidates =
  4. Tl, Tz = get-new-time-ranges(T, Change-Points, new-change-point)
  5. while(stopping criteria is not met) do begin
    1. cl = find-candidate(T1, MSet)
    2. c2 = find-andidate(T2, MSet)
    3. Candidates = Candidates
    4. Candidates = Candidates
    5. new-change-point = c Candidates |Q(Change-Points,c) = min
    6. Candidates = Candidates \ new-change-point
    7. Tl,T2 = get-new-time-ranges(T, Change-Points, new-change-point)
    8. Change-Points = Change-Points new-change-points
  6. end
  • Stopping Criteria

If in iterations k and k+1 the respective likelihood criteria, the algorithm should stop if the difference of the likelihood proportioned to the last step likelihood is below a small constant.

  • Experiment with Traffic Data

The data used in our experiments was taken from highway traffic sensors, called loop detectors, in the Minneapolis-St. Paul metro area. A loop detector is a sensor, embedded in the road, with an electro-magnetic field around it, which is broken when a vehicle goes over it. Each such breaking, and the subsequent reestablishment, of the field is recorded as a count. Traffic volume is defined as the vehicle count per unit time. We need to find the change point detection algorithm performed compared to a person doing the same task through visual inspection.

  • The results seems a little week. It has no ground truth. Statistically speaking it does performs better than 4 human subject.

Incremental Algorithm

Because the next data point collected by the sensor reflects significant change in phenomenon, then its likelihood criteria of being a change point is going to b smaller than likelihood criteria that it is not. However, if the difference in likelihood is small, it can just be noise. So the incremental algorithm change the stopping criteria to

  • The no_change period and change period likelihood difference is below a small constant times of the no_change period likelihood.
Performance Evaluation

Not as good as the batch model, because it is local optimum since the future coming data is not observed

Interesting points

It is a non-Bayesian model, hence prior model doesn't require. It is very easy to implement and it considers the time span which is very important, but since it is a quite old paper, the experiments are obsolete.



Related Papers

  • T. Sakaki, M. Okazaki, and Y. Matsuo. Earthquake shakes Twitter users: real-time event detection by social sensors.

In Proceedings of the 19th international conference on World wide web, WWW ’10, pages 851–860, New York, NY, USA, April 2010. ACM

  • J. Sankaranarayanan, H. Samet, B. E. Teitler, M. D.Lieberman, and J. Sperling. TwitterStand: news in tweets. In GIS ’09: Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pages 42–51, New York, NY, USA, November 2009. ACM Press.