Topic segmentation is the process of dividing written text into segments that each cover a coherent topic. In corpora such as transcripts of streaming audio, this task is nontrivial because the corpus has no explicit representation of a document, nor any clear demarcation of where document breaks occur. Furthermore, a single document may contain multiple topics, so the task of automatic text segmentation may be to discover these topics and segment the text accordingly.
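To make the idea concrete, the following is a minimal sketch of one possible approach, loosely in the spirit of lexical-cohesion methods such as TextTiling: a boundary is placed wherever the vocabulary of the sentences just before a gap has little overlap with the vocabulary of the sentences just after it. The function names (`bag_of_words`, `cosine`, `segment`) and the `window` and `threshold` parameters are hypothetical choices made for this illustration, and the input is assumed to be already split into sentences.

```python
import math
import re
from collections import Counter


def bag_of_words(sentences):
    """Count lowercase word tokens across a list of sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(re.findall(r"[a-z]+", s.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two Counters; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Group sentences into segments, splitting at low-similarity gaps.

    `window` (sentences compared on each side of a gap) and `threshold`
    (similarity below which a boundary is placed) are illustrative
    assumptions, not tuned values.
    """
    segments, current = [], []
    for i, sentence in enumerate(sentences):
        if i > 0:
            left = bag_of_words(sentences[max(0, i - window):i])
            right = bag_of_words(sentences[i:i + window])
            if cosine(left, right) < threshold:
                segments.append(current)  # close the segment at this gap
                current = []
        current.append(sentence)
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    transcript = [
        "The cat sat on the mat.",
        "The cat chased the mouse.",
        "Stock markets fell sharply today.",
        "Investors sold stock as markets fell.",
    ]
    for part in segment(transcript):
        print(part)
```

In this sketch, common function words such as "the" count toward similarity, so a practical system would likely remove stop words, apply stemming, and choose boundaries from local minima of the similarity curve rather than from a fixed threshold.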

