FBIS corpus

From Cohen Courses
Revision as of 02:52, 2 November 2011 by Aanavas

The FBIS corpus is a collection of radio newscasts that includes datasets of parallel text in multiple languages. For example, the Chinese-English parallel corpus contains 237.6 million English words and 215.4 million Chinese words.

The Foreign Broadcast Information Service (FBIS) was an open-source intelligence component of the Central Intelligence Agency's Directorate of Science and Technology. It monitored and translated openly available news and information from media sources outside the United States, and disseminated the results within the U.S. government.


Relevant Papers