Topic Detection and Tracking

From Cohen Courses
Revision as of 20:18, 26 September 2012 by Ysim (talk | contribs)

This dataset is used for the Topic Detection and Tracking task hosted by NIST [1].

Annotation guidelines are available here.

This dataset contains 407,505 news articles in Arabic, Mandarin, and English. The news articles are annotated for topics, events, and activities.

From the annotation guidelines:

A TDT event is defined as a particular thing that happens at a specific time and place, along with all necessary preconditions and unavoidable consequences.  A TDT event might be a particular plane crash, or a single meeting, or a particular court hearing.  An activity is a connected set of events that have a common focus or purpose, happening at a specific place and time; for instance, a campaign, or an investigation, or a disaster relief effort.  For the purposes of TDT, a topic is defined as an event or activity, along with all directly related events and activities.  
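The relationship between the three definitions above can be sketched as a small data model. This is only an illustrative sketch, not the format of the TDT annotation files: the class names, field names, and example values below are all hypothetical and were chosen to mirror the wording of the guidelines.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Event:
    """A particular thing that happens at a specific time and place."""
    description: str
    time: str   # hypothetical free-text field
    place: str  # hypothetical free-text field


@dataclass
class Activity:
    """A connected set of events that share a common focus or purpose."""
    focus: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Topic:
    """A seed event or activity plus all directly related events and activities."""
    seed: Union[Event, Activity]
    related: List[Union[Event, Activity]] = field(default_factory=list)

    def all_events(self) -> List[Event]:
        """Flatten the topic into its individual events, seed first."""
        events: List[Event] = []
        for item in [self.seed, *self.related]:
            if isinstance(item, Event):
                events.append(item)
            else:
                events.extend(item.events)
        return events


# Made-up example: a plane crash (event) seeds a topic that also covers
# the follow-up investigation (activity), which contains a court hearing.
crash = Event("plane crash", "hypothetical date", "hypothetical place")
hearing = Event("court hearing", "hypothetical later date", "hypothetical court")
investigation = Activity("crash investigation", [hearing])
topic = Topic(seed=crash, related=[investigation])
```

Under this sketch, a topic is simply a container: calling `topic.all_events()` returns the seed event followed by the events of each related activity.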

Relevant Papers