Text summarization
From Cohen Courses
Revision as of 15:59, 30 September 2010 by PastStudents
Summary
Text summarization (also known as automatic summarization, or simply summarization) is a natural language processing task that uses computer algorithms to create shortened versions of texts while retaining the important points of the original.
Common Approaches
Approaches to text summarization typically fall into one of the following categories:
- Extraction: selects the most important units (sentences or paragraphs) from the original text and copies them verbatim to form the summary
- Abstraction: paraphrases sections of the original text and relies on language generation to make the summary coherent
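As an illustration of the extraction approach, the following is a minimal sketch that scores each sentence by the average document frequency of its words and copies the top-scoring sentences into the summary. The scoring heuristic, the tiny stopword list, the regex-based sentence splitter, and the function names (tokenize, extractive_summary) are all assumptions made for this example and are not taken from any particular system.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it", "that", "on", "for"}


def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extractive_summary(text, num_sentences=1):
    """Return the num_sentences highest-scoring sentences, in document order."""
    # Naive sentence splitting: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole document, ignoring stopwords.
    freqs = Counter(w for w in tokenize(text) if w not in STOPWORDS)

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in STOPWORDS]
        # Average frequency, so long sentences are not favored automatically.
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    doc = "Cats chase mice. Cats sleep all day. Dogs bark."
    print(extractive_summary(doc, num_sentences=1))
```

Because the selected sentences are copied unchanged, the output is always grammatical at the sentence level, but nothing in this sketch guarantees that the chosen sentences read coherently together.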
Challenges / Issues
Some major challenges in text summarization
Evaluation
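One commonly used evaluation metric in summarization is ROUGE (Recall-Oriented Understudy for Gisting Evaluation), which scores a system summary by its n-gram overlap with human-written reference summaries. ROUGE is used in the summarization tasks of NIST's Document Understanding Conferences (DUC).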
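To make the idea behind ROUGE concrete, the following is a simplified sketch of ROUGE-N recall: the fraction of reference n-grams that also appear in the candidate summary, with counts clipped so that a repeated n-gram is not credited more often than it occurs in the candidate. It assumes whitespace tokenization and a single reference, and it omits the stemming, stopword handling, and multi-reference support of the official toolkit. The function name rouge_n_recall is hypothetical.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """Simplified ROUGE-N recall of a candidate against one reference summary."""

    def ngrams(text):
        # Assumption: lowercase whitespace tokenization, no stemming.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference is shorter than n tokens

    # Clipped overlap: each reference n-gram is matched at most as many
    # times as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat was under the bed"
    candidate = "the cat was found under the bed"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

Since this is a recall measure, a candidate that simply contains every reference word scores perfectly regardless of its length, which is one reason summaries are usually evaluated under a fixed length limit.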