Bbd writeup of Jansche and Abney

From Cohen Courses
Revision as of 10:42, 3 September 2010 by WikiAdmin (talk | contribs) (1 revision)

This is a review of Jansche_2002_information_extraction_from_voicemail_transcripts by user:bbd.

I liked

  • They perform the information extraction tasks in two phases: the first phase aims to increase recall and the second to increase precision. I liked the idea of optimizing precision and recall separately.
  • They came up with some really useful hand-crafted rules, based on position cues and telephone number lengths, which perform really well for this specific task and data set.
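
To make the two-phase idea concrete, here is a minimal sketch of how I picture it for phone numbers. It is not the authors' implementation. The function names, the digit-word table, and the set of plausible lengths are my own assumptions, and only a length rule is shown (no position cues).

```python
# Spoken digit words mapped to digits (an assumed, simplified vocabulary).
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

# Hypothetical lengths treated as plausible phone numbers, for illustration only.
PLAUSIBLE_LENGTHS = {4, 7, 10, 11}


def propose_candidates(tokens):
    """Phase 1 (recall-oriented): return every maximal run of digit words."""
    candidates = []
    start = None
    # A trailing None acts as a sentinel so a run at the end is closed too.
    for i, tok in enumerate(tokens + [None]):
        if tok in DIGIT_WORDS:
            if start is None:
                start = i
        elif start is not None:
            digits = "".join(DIGIT_WORDS[t] for t in tokens[start:i])
            candidates.append((start, i, digits))
            start = None
    return candidates


def filter_candidates(candidates):
    """Phase 2 (precision-oriented): keep only candidates of plausible length."""
    return [c for c in candidates if len(c[2]) in PLAUSIBLE_LENGTHS]


def extract_phone_numbers(transcript):
    """Run both phases on a whitespace-tokenized transcript."""
    tokens = transcript.lower().split()
    return [digits for _, _, digits in filter_candidates(propose_candidates(tokens))]


if __name__ == "__main__":
    text = "Call me at five five five one two three four around two"
    print(extract_phone_numbers(text))
```

The first phase deliberately over-generates (the lone "two" is proposed as a candidate), and the second phase discards candidates that fail the length rule. The hard-coded lengths are exactly the kind of dataset-specific assumption that may not carry over to other countries.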

I didn't like

  • They mention that the trigram-based techniques proposed by Huang may overfit the data and won't generalize. I feel the features used in this paper are also very specific to the dataset they are dealing with, since phone numbers in different countries may have different properties. The context features for caller extraction may also differ across places and languages. Hence good precision and recall can't be guaranteed on other data.