Network of Blogs
This is one of the datasets discussed in Social Media Analysis 10-802 in Spring 2010.
- # Blogs = 45,000
- # Posts = 10,500,000
- # Links = 16,200,000
This dataset was used in the paper "Cost Effective Outbreak Detection in Networks".
This dataset was generated by sampling from a much larger set of 2.5 million blogs. The dataset's creators kept only blogs that received at least 3 in-links in the first 6 months of 2006, and then collected all posts from those blogs for the full year. Each post carries rich metadata, including a time stamp, which makes it possible to extract information cascades.
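To make the idea of cascade extraction concrete, here is a minimal Python sketch. The file format of the dataset is not described on this page, so the input layout is an assumption: `posts` maps a hypothetical post id to a `(blog_id, timestamp)` pair, and `links` is a list of `(src, dst)` pairs meaning that post `src` links to post `dst`. The function name `extract_cascades` is also made up for illustration. The sketch treats a post that is linked to but links to no other post as the start of a cascade, and follows in-links forward in time from there; the paper's exact procedure may differ.

```python
from collections import defaultdict, deque


def extract_cascades(posts, links):
    """Group posts into cascades by following links backward to a root post.

    posts: dict mapping post_id -> (blog_id, timestamp)   (assumed layout)
    links: iterable of (src, dst) pairs, src links to dst (assumed layout)
    Returns a dict mapping each root post_id to the list of post_ids in its
    cascade, ordered by timestamp.
    """
    cited_by = defaultdict(list)  # dst -> posts that link to it
    has_out_link = set()          # posts that link to some earlier post

    for src, dst in links:
        # Skip self-links and links to posts we have no metadata for.
        if src == dst or src not in posts or dst not in posts:
            continue
        # A post can only be influenced by a post that is not newer than it.
        if posts[dst][1] > posts[src][1]:
            continue
        cited_by[dst].append(src)
        has_out_link.add(src)

    cascades = {}
    for root in list(cited_by):
        if root in has_out_link:
            continue  # not a cascade initiator
        seen = {root}
        queue = deque([root])
        while queue:
            current = queue.popleft()
            for follower in cited_by.get(current, ()):
                if follower not in seen:
                    seen.add(follower)
                    queue.append(follower)
        # Order by time stamp, breaking ties by post id for determinism.
        cascades[root] = sorted(seen, key=lambda p: (posts[p][1], p))
    return cascades
```

Called on a toy input where post `b` links to `a` and post `c` links to `b`, the function groups all three posts into one cascade rooted at `a`. Time stamps only need to be mutually comparable (integers, floats, or `datetime` objects all work).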