NTCIR-6 Opinion


This is one of the datasets.

The collection consists of document data, topics, and annotations. The document data comprises Mainichi Newspaper 1998-2001 (Japanese), Yomiuri Newspaper 1998-2001 (Japanese), CIRB020 1998-1999 + CIRB040 2000-2001 (Traditional Chinese, various newspapers from Taiwan), Mainichi Daily 1998-2001 (English, published in Japan), Daily Yomiuri 2000-2001 (English, published in Japan), Korea Times 2000-2001 (Korean newspaper), and Hong Kong Standard 1998-1999 (English, published in Hong Kong). There are 32 topics covering 1998-2001, each available in English, Chinese, and Japanese. The annotations assign opinion tags to sentences in the selected documents that are relevant to the topics. The annotated documents are distributed separately in a sentence-segmented format whose sentence numbering aligns with that used in the CSV annotation files.


  • CIRB020: 249,508 docs
  • CIRB040: 901,446 docs
  • Mainichi: 419,759 docs
  • Yomiuri: 1,034,699 docs
  • Mainichi Daily: 24,878 docs
  • Daily Yomiuri: 17,741 docs
  • Korea Times: 30,530 docs
  • Hong Kong Standard: 96,856 docs
  • Xinhua: 406,792 docs
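
Because the annotation files refer to sentences by number, using them typically means joining each annotation row to the matching sentence in the sentence-segmented documents. The following is a minimal sketch of such a join in Python. The column names (topic_id, doc_id, sentence_no, opinionated), the tag values, and the in-memory document layout are all hypothetical placeholders chosen for illustration; the actual NTCIR-6 file layout may differ and should be checked against the distribution.

```python
import csv
import io

# Hypothetical annotation CSV: column names and values are invented for
# illustration and are NOT the real NTCIR-6 layout.
ANNOTATIONS_CSV = """topic_id,doc_id,sentence_no,opinionated
T001,DOC-A,1,YES
T001,DOC-A,2,NO
T001,DOC-B,1,YES
"""

# Hypothetical sentence-segmented documents: doc_id -> {sentence_no: text}.
SENTENCES = {
    "DOC-A": {1: "First sentence of A.", 2: "Second sentence of A."},
    "DOC-B": {1: "First sentence of B."},
}


def align(csv_text, sentences):
    """Join annotation rows to sentence text on (doc_id, sentence_no)."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        doc_id = rec["doc_id"]
        sent_no = int(rec["sentence_no"])
        text = sentences.get(doc_id, {}).get(sent_no)
        if text is None:
            # Skip annotations that have no matching sentence.
            continue
        rows.append((rec["topic_id"], doc_id, sent_no, rec["opinionated"], text))
    return rows


aligned = align(ANNOTATIONS_CSV, SENTENCES)
for row in aligned:
    print(row)
```

Keying on the pair (document ID, sentence number) rather than on sentence text avoids mismatches caused by whitespace or encoding differences between the annotation files and the document files.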


Relevant Papers