Topic Segmentation

From Cohen Courses
Revision as of 01:20, 27 March 2011 by Dwijaya (talk | contribs)

Topic segmentation is the process of dividing written text into meaningful, topically coherent segments. In corpora such as transcripts of streaming audio, this task is nontrivial because the corpus has no explicit representation of a document and no clear demarcation of where document breaks occur. Furthermore, a single document may contain multiple topics, so the task of computerized text segmentation may be to discover these topics automatically and segment the text accordingly.
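
To make the task concrete, the following is a minimal sketch of one simple way to place topic boundaries in an unmarked stream of sentences, based on word overlap between adjacent sentences. It is an illustrative assumption, not a method described on this page: the function names (bag_of_words, cosine, segment) and the threshold value are hypothetical, and practical systems typically compare larger windows of text and remove stop words first.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment(sentences, threshold=0.1):
    """Group sentences into segments.

    A new segment starts whenever the similarity between a sentence
    and its predecessor drops below `threshold` (an assumed value
    that would need tuning on real data).
    """
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(bag_of_words(prev), bag_of_words(cur)) < threshold:
            segments.append([cur])      # low overlap: start a new topic
        else:
            segments[-1].append(cur)    # enough overlap: same topic
    return segments


if __name__ == "__main__":
    stream = [
        "The cat sat on the mat.",
        "The cat chased the mouse.",
        "Stock prices fell sharply today.",
        "Investors sold stock quickly.",
    ]
    for i, seg in enumerate(segment(stream), start=1):
        print(i, seg)
```

Because the sketch counts every word, frequent function words such as "the" inflate the similarity between unrelated sentences; filtering them out is a common refinement.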

External Link

Relevant Papers