Network of Blogs

From Cohen Courses

Latest revision as of 10:42, 3 September 2010

This is one of the datasets discussed in Social Media Analysis 10-802 in Spring 2010.

  • # Blogs = 45000
  • # Posts = 10500000
  • # Links = 16200000

This dataset was used in the paper "Cost-effective Outbreak Detection in Networks".

This dataset was generated by sampling from a much larger set of 2.5 million blogs. The authors kept only blogs that received at least 3 in-links in the first 6 months of 2006, and then took all posts from those blogs for the full year. Posts have rich metadata, including time stamps, which allows information cascades to be extracted.
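As a rough illustration of how time stamps and links can be turned into cascades, here is a minimal Python sketch. It is not the procedure from the paper: the record layout (`posts`, `links`), the toy data, and the function name `extract_cascades` are all hypothetical. It assumes a cascade is a group of posts connected by links in which the linking post is newer than the post it links to.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy records: post_id -> (blog_id, time stamp).
posts = {
    "p1": ("blogA", datetime(2006, 1, 1)),
    "p2": ("blogB", datetime(2006, 1, 2)),
    "p3": ("blogC", datetime(2006, 1, 3)),
    "p4": ("blogD", datetime(2006, 1, 5)),
    "p5": ("blogE", datetime(2006, 2, 1)),
}

# Hypothetical links: (linking_post, linked_post).
links = [("p2", "p1"), ("p3", "p2"), ("p4", "p1"), ("p1", "p5")]


def extract_cascades(posts, links):
    """Group posts into cascades using time-respecting links only."""
    parent = {p: p for p in posts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in links:
        if src not in posts or dst not in posts:
            continue  # link leaves the sampled set of posts
        if posts[src][1] <= posts[dst][1]:
            continue  # assumption: a post can only link to an older post
        parent[find(src)] = find(dst)

    groups = defaultdict(list)
    for p in posts:
        groups[find(p)].append(p)

    # Keep groups with at least two posts, each ordered by time stamp.
    cascades = [
        sorted(g, key=lambda p: posts[p][1])
        for g in groups.values()
        if len(g) > 1
    ]
    return sorted(cascades, key=lambda c: posts[c[0]][1])


cascades = extract_cascades(posts, links)
```

On the toy data, the link from `p1` to `p5` is dropped because `p5` is newer than `p1`, so `p5` stays isolated and the remaining four posts form a single cascade that starts at `p1`.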