TREC BLOG06

From Cohen Courses
Revision as of 02:57, 31 March 2011 by Reyyan

BLOG06 is a TREC test collection that was created and is distributed by the University of Glasgow.

The dataset contains feeds, permalinks and homepages collected over an 11-week period:

  • 100,649 feeds
  • 3,215,171 permalinks
  • 324,880 homepages
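
To give a rough sense of scale, the counts above can be combined in a short Python sketch. The dictionary name and keys are illustrative only and do not reflect the collection's actual distribution format; the figures are the ones listed above.

```python
# Component counts as listed for BLOG06 (names are illustrative only).
blog06_counts = {
    "feeds": 100_649,
    "permalinks": 3_215_171,
    "homepages": 324_880,
}

# Total number of items across the three components.
total_items = sum(blog06_counts.values())

# Average number of permalinks (individual posts) per feed.
permalinks_per_feed = blog06_counts["permalinks"] / blog06_counts["feeds"]

print(f"Total items: {total_items:,}")
print(f"Average permalinks per feed: {permalinks_per_feed:.1f}")
```

Under these figures the collection holds about 3.64 million items in total, with roughly 32 permalinks per feed on average.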

17,969 spam blogs were added to the corpus in order to make it more realistic.

More information about the dataset can be found at RelatedPaper:Macdonald and Ounis 2006