Selen writeup of Jansche and Abney
This is a review of Jansche_2002_information_extraction_from_voicemail_transcripts by user:Selen.
In this paper, the authors extract caller identity from voicemail transcripts. They show that it is possible to achieve high accuracy while using less common words and names as features, together with positional cues.
What I like about this paper is that they divide the task into smaller subtasks; for example, when extracting phone numbers they use a two-step approach.
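To make the two-step idea concrete, here is a minimal sketch in Python. It is my own illustration, not the authors' method: I assume step one proposes candidate digit sequences in the transcript and step two decides which candidates to keep. The names `DIGIT_WORDS`, `find_candidates`, `keep_candidate`, and `extract_phone_numbers` are hypothetical, and the length rule in step two is only a stand-in for whatever learned classifier the paper actually uses.

```python
# Hypothetical mapping from spoken digit words to digits (an assumption).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def find_candidates(tokens):
    """Step 1: return (start, end, digits) for each maximal run of digit words."""
    candidates = []
    start = None
    digits = []
    for i, tok in enumerate(tokens):
        d = DIGIT_WORDS.get(tok.lower())
        if d is not None:
            if start is None:
                start = i
            digits.append(d)
        elif start is not None:
            # The run ended at the previous token; record it.
            candidates.append((start, i, "".join(digits)))
            start, digits = None, []
    if start is not None:
        candidates.append((start, len(tokens), "".join(digits)))
    return candidates


def keep_candidate(digits):
    """Step 2: placeholder filter (assumption): keep 7- or 10-digit runs."""
    return len(digits) in (7, 10)


def extract_phone_numbers(transcript):
    """Run both steps on a whitespace-tokenized transcript."""
    tokens = transcript.split()
    return [d for _, _, d in find_candidates(tokens) if keep_candidate(d)]


if __name__ == "__main__":
    msg = "hi this is pat call me at five five five one two three four after two pm"
    print(extract_phone_numbers(msg))
```

Splitting the work this way keeps the first step cheap and high-recall, and lets the second step concentrate on rejecting digit runs that are not phone numbers, such as the time of day in the example message.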
What I don't like about this paper is that they give few details about the other approaches; for example, what kind of hand-crafted grammatical rules were used? Also, they compare Col log linear with their method and HZP on manual transcripts, but they do not make the same comparison with Col log linear on automatic transcriptions. Their results on manual transcriptions should also have been better for the paper to be more convincing.