MemeTracker

This dataset consists of 343 million short textual phrases collected from online blogs, each with a timestamp. The different textual variants of the same phrase are aggregated into a phrase cluster, and each cluster is treated as one cascade: it is simply the set of timestamps at which the phrase is mentioned in the blogs.
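
To make the cascade definition concrete, the following is a minimal sketch of how mentions might be grouped into cascades. It assumes the data has already been reduced to (cluster id, timestamp) pairs; the record layout, the function name `build_cascades`, and the toy values are hypothetical and are not taken from the dataset itself. Sorting the timestamps is a convenience for downstream sequence models, not part of the definition.

```python
from collections import defaultdict


def build_cascades(mentions):
    """Group (cluster_id, timestamp) pairs into one timestamp list per cluster.

    Each returned list is sorted in ascending order and represents one cascade.
    """
    cascades = defaultdict(list)
    for cluster_id, timestamp in mentions:
        cascades[cluster_id].append(timestamp)
    return {cid: sorted(ts) for cid, ts in cascades.items()}


# Hypothetical toy records: (phrase-cluster id, mention time in hours).
mentions = [
    ("c1", 5.0),
    ("c2", 1.5),
    ("c1", 0.5),
    ("c1", 2.0),
    ("c2", 3.0),
]

cascades = build_cascades(mentions)
```

Here `cascades["c1"]` holds the three mention times of the first phrase cluster in chronological order, regardless of which textual variant produced each mention.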