AOL query log dataset

From Cohen Courses
Revision as of 14:19, 20 April 2010 by PastStudents (talk | contribs)

To build this dataset, the 1050 most frequent queries were first selected from the AOL query log. To make the dataset diverse enough, another 1050 queries were sampled randomly from the AOL query log, with each query's chance of selection proportional to its relative frequency. Most of these queries are in English. Finally, for each query, the top 500 results returned by Google, Yahoo!, or MSN were retained as seeds.
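
The query selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used to build the dataset: the function name build_query_set and its parameters are hypothetical, the log is assumed to be available as a plain list of query strings, and the random sample is assumed to be drawn without replacement from the queries that are not already in the most-frequent set (the description above does not say either way). Weighted sampling without replacement is done here with the Efraimidis-Spirakis key method, which is a swapped-in technique.

```python
import heapq
import random
from collections import Counter


def build_query_set(log_queries, top_k=1050, sample_k=1050, seed=0):
    """Hypothetical sketch: pick the top_k most frequent queries, then
    sample sample_k further queries with probability proportional to
    their frequency in the log."""
    counts = Counter(log_queries)

    # Step 1: the most frequent queries.
    top = [query for query, _ in counts.most_common(top_k)]
    top_set = set(top)

    # Step 2 (assumption): sample only from queries not already selected,
    # without replacement. Each query gets the key u ** (1 / weight) with
    # u uniform in [0, 1); the largest keys form the weighted sample.
    rng = random.Random(seed)
    keyed = [
        (rng.random() ** (1.0 / count), query)
        for query, count in counts.items()
        if query not in top_set
    ]
    sampled = [query for _, query in heapq.nlargest(sample_k, keyed)]
    return top, sampled
```

For example, build_query_set(queries, top_k=1050, sample_k=1050) would return two lists of up to 1050 queries each. Retrieving the top 500 search results per query is a separate step and is not covered by this sketch.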

Link: AOL query log dataset

Relevant Papers