Nschneid writeup of Jansche and Abney

From Cohen Courses

This is Nschneid's review of Jansche_2002_information_extraction_from_voicemail_transcripts

Given a corpus of voicemail transcripts, the task of identifying the caller's name is best addressed with a log-linear tagging model that takes positional information into account (callers tend to identify themselves at the beginning of the message). The task of identifying the caller's number is best handled in two stages: a high-recall, rule-based system first identifies the digits implied by each sequence of number words, and a binary (decision tree) classifier, whose features include the number of digits, then accepts or rejects each candidate.
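To make the two-stage number pipeline concrete, here is a minimal sketch in Python. It is not the paper's implementation: the word list, the function names (`tokenize`, `candidate_numbers`, `features`, `plausible_phone_number`), the `relative_position` feature, and the digit-count rule are all assumptions made for illustration. In particular, the last function is a hand-written stand-in for the trained decision tree, and the rule-based stage here only handles single digit words, not forms like "twenty" or "hundred".

```python
import re

# Hypothetical vocabulary: spoken digit words mapped to digits.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}

def tokenize(transcript):
    """Lowercase the transcript and split it into word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())

def candidate_numbers(tokens):
    """Stage 1 (high recall): every maximal run of digit words is a candidate.

    Returns a list of (start, end, digits), where tokens[start:end] is the run.
    """
    candidates = []
    start, digits = None, []
    for i, tok in enumerate(tokens + [""]):  # empty sentinel flushes a final run
        if tok in DIGIT_WORDS:
            if start is None:
                start = i
            digits.append(DIGIT_WORDS[tok])
        elif start is not None:
            candidates.append((start, i, "".join(digits)))
            start, digits = None, []
    return candidates

def features(candidate, n_tokens):
    """Stage 2 input: features for the binary classifier."""
    start, _end, digits = candidate
    return {
        "num_digits": len(digits),              # feature named in the summary
        "relative_position": start / n_tokens,  # assumed extra feature
    }

def plausible_phone_number(feats):
    """Hand-written stand-in for the trained decision tree (an assumption)."""
    return feats["num_digits"] in (7, 10, 11)

message = ("hi this is pat call me at five five five one two three four "
           "my extension is four two")
tokens = tokenize(message)
for cand in candidate_numbers(tokens):
    feats = features(cand, len(tokens))
    print(cand[2], feats["num_digits"], plausible_phone_number(feats))
```

The first stage deliberately over-generates (the extension "42" is also proposed), which is what makes it high-recall; the second stage is where precision is recovered, and the digit count alone already separates the two candidates in this toy message.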

This paper did not strike me as especially surprising or enlightening, but perhaps it was a novel approach in 2002.