Cascading Behavior in Large Blog Graphs

From Cohen Courses

This is a scientific paper authored by Leskovec et al. that appeared in SDM 2007. Below is a summary of the paper written by Tuan Anh.

Citation

@proceedings{DBLP:conf/sdm/2007,
 title     = {Proceedings of the Seventh SIAM International Conference
              on Data Mining, April 26-28, 2007, Minneapolis, Minnesota,
              USA},
 booktitle = {SDM},
 publisher = {SIAM},
 year      = {2007},
 bibsource = {DBLP, http://dblp.uni-trier.de}
}

Online Version

Cascading Behavior in Large Blog Graphs

Summary

This paper reports an empirical study of how information diffuses in the blogosphere and proposes a model of the diffusion process. A diffusion step from one post to another, or from one blog to another, is represented by a citing link from the latter to the former, so a diffusion chain can be tracked as a citing subgraph. The authors then compute statistics over the set of citing subgraphs. Their key findings include:

  • The number of blog posts per day has a weekly periodicity.
  • The popularity of a post, measured by the number of posts citing it, decays as a power law in the number of days since the post was published.
  • The degree distribution of the blog network follows a power law, and so does that of the post network.
  • Most citing subgraphs in the post network, called "cascades" in the paper, are tree-like, and about 97% of cascades are trivial (isolated posts).
  • Cascade sizes follow a Zipf distribution.
  • The number of cascades in which a node (post) takes part follows a power law.
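
To make the notion of a cascade concrete, the following is a minimal sketch of how citing subgraphs could be extracted and summarized. It assumes a hypothetical list of (citing post, cited post) pairs and treats each weakly connected component of the post graph as one cascade, which is a simplification and not necessarily the exact procedure used in the paper. The function names and the input format are illustrative assumptions.

```python
from collections import Counter, defaultdict


def extract_cascades(posts, links):
    """Group posts into cascades (assumption: one cascade per weakly
    connected component of the citation graph).

    posts: iterable of all post ids (hypothetical input format).
    links: iterable of (citing_post, cited_post) pairs.
    """
    # Build an undirected adjacency map; link direction is ignored
    # when deciding which posts belong to the same cascade.
    neighbors = defaultdict(set)
    for citing, cited in links:
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)

    seen = set()
    cascades = []
    for post in posts:
        if post in seen:
            continue
        # Iterative depth-first traversal from an unvisited post.
        component = set()
        stack = [post]
        seen.add(post)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        cascades.append(component)
    return cascades


def size_distribution(cascades):
    """Map cascade size -> number of cascades of that size."""
    return Counter(len(c) for c in cascades)


def trivial_fraction(cascades):
    """Fraction of cascades that consist of a single isolated post."""
    if not cascades:
        return 0.0
    return sum(1 for c in cascades if len(c) == 1) / len(cascades)
```

On real data, plotting the output of `size_distribution` on log-log axes would be one way to inspect the heavy-tailed shape reported above, and `trivial_fraction` corresponds to the share of isolated posts.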

Finally, the authors propose a cascade model based on diffusion models from epidemiology, i.e., the SIS (susceptible-infected-susceptible) model and its variations. They validate the model by comparing the statistics observed above with those obtained from synthetic datasets generated by simulating the model.
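
As an illustration of the SIS idea, the sketch below simulates a single cascade on a small graph. It is a simplified reading of an SIS-style process, not the authors' exact model: the `followers` map (who may cite whom), the infection probability `beta`, and the `max_steps` cap are all assumptions introduced here. In each step, every currently infected node infects each of its followers with probability `beta`, and the previously infected nodes then become susceptible again.

```python
import random


def simulate_sis_cascade(followers, beta, rng, max_steps=1000):
    """Simulate one SIS-style cascade (illustrative sketch).

    followers: dict mapping a node to the nodes that may cite it
               (hypothetical input format).
    beta:      probability that an infected node infects a follower.
    rng:       a random.Random instance, passed in for reproducibility.
    max_steps: safety cap, since SIS re-infection can loop on cycles.

    Returns (seed, edges), where edges is a list of
    (citing_node, cited_node) pairs forming the cascade.
    """
    nodes = sorted(followers)  # sorted so a seeded rng is reproducible
    seed = rng.choice(nodes)
    infected = [seed]
    edges = []
    for _ in range(max_steps):
        if not infected:
            break
        currently = set(infected)
        newly = {}  # newly infected node -> node that infected it
        for u in infected:
            for v in followers.get(u, ()):
                if v in currently or v in newly:
                    continue
                if rng.random() < beta:
                    newly[v] = u
        edges.extend((v, u) for v, u in newly.items())
        # SIS step: previously infected nodes recover and are
        # susceptible again; only the newly infected stay active.
        infected = list(newly)
    return seed, edges
```

Running this many times with different seeds and collecting the sizes of the resulting edge lists gives a synthetic cascade-size distribution that could be compared against the observed one, in the spirit of the validation described above.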

Discussion

Related papers

There is similar work on tracking diffusion in social media, e.g.,