AOL query log dataset
To build this dataset, the 1050 most frequent queries were first selected from the AOL query log. To make the dataset sufficiently diverse, another 1050 queries were sampled randomly from the AOL query log according to their relative frequency. Most of these queries are in English. Finally, for each query, the top 500 results returned by Google, Yahoo!, or MSN were retained as seeds.
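
The query selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure the dataset authors actually ran: the function name `select_queries`, its parameters, and the choice to sample the second group without replacement from queries outside the top group are all hypothetical, since the description does not specify these details.

```python
import random
from collections import Counter


def select_queries(query_log, n_top=1050, n_sampled=1050, seed=0):
    """Pick the most frequent queries, then sample more by relative frequency.

    query_log is an iterable of query strings, one entry per logged query.
    Assumption: the sampled group excludes the top group and has no repeats.
    """
    counts = Counter(query_log)

    # Step 1: the n_top most frequent queries.
    top = [query for query, _ in counts.most_common(n_top)]
    top_set = set(top)

    # Step 2: frequency-weighted sampling without replacement from the rest.
    pool = {q: c for q, c in counts.items() if q not in top_set}
    rng = random.Random(seed)
    sampled = []
    while pool and len(sampled) < n_sampled:
        candidates = list(pool)
        weights = [pool[q] for q in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        sampled.append(pick)
        del pool[pick]  # remove so the same query is not drawn twice

    return top, sampled
```

For example, `select_queries(log, n_top=2, n_sampled=2)` on a small list of query strings returns the two most frequent queries plus two distinct queries drawn from the remainder, with more frequent queries more likely to be drawn. The loop rebuilds the candidate list on every draw, which is simple but slow for a full-size log; a production version would likely use a more efficient weighted sampler.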
Link: AOL query log dataset