User talk:Xxiong
Citation
Einat Minkov, Richard C. Wang & William W. Cohen, Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text, in HLT/EMNLP 2005
Online version
Extracting Personal Names from Emails
Summary
Task: extract person names from emails
Techniques: NER is treated as a sequence tagging problem, and a CRF model is used for the task.
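As a rough illustration of the tagging formulation, the sketch below turns annotated name spans into one tag per token. The BIO tag names (`B-PER`, `I-PER`, `O`), the helper `spans_to_bio`, and the example sentence are assumptions made for illustration; the paper's actual tag set may differ, and the CRF itself is not shown.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], name_spans: List[Tuple[int, int]]) -> List[str]:
    """Convert name spans (start inclusive, end exclusive) into one tag per token.

    Hypothetical helper: BIO is a common convention, not necessarily the
    tag set used in the paper.
    """
    tags = ["O"] * len(tokens)
    for start, end in name_spans:
        tags[start] = "B-PER"          # first token of a person name
        for i in range(start + 1, end):
            tags[i] = "I-PER"          # remaining tokens of the same name
    return tags


# Made-up email sentence; tokens 4 and 5 form one person name.
tokens = ["Please", "forward", "this", "to", "Richard", "Wang", "today"]
tags = spans_to_bio(tokens, [(4, 6)])
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence model such as a CRF would then be trained to predict the tag column from features of each token and its neighbors.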
Contribution:
- email-specific feature set.
- The authors found that name repetition within a single document is more common in newswire, while name repetition across multiple documents is more common in email. The recall-enhancing techniques below are based on this finding.
Recall-enhancing Techniques:
- single document repetition (SDR): a token extracted as a name is also marked as a name at its other occurrences within the same document.
- multiple document repetition (MDR): a token extracted as a name is also marked as a name at its occurrences in other documents.
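The two techniques can be sketched as a post-processing step over an extractor's output. This is a simplified reading of the notes above, not the authors' implementation: the function names `sdr` and `mdr`, the lowercase matching, and the toy documents are assumptions, and any filtering of ambiguous tokens is omitted.

```python
from typing import List, Set

Doc = List[str]


def sdr(docs: List[Doc], predicted: List[Set[int]]) -> List[Set[int]]:
    """Single document repetition: spread name predictions within each document.

    predicted[d] holds the token positions tagged as names in docs[d].
    """
    expanded = []
    for tokens, positions in zip(docs, predicted):
        # Tokens this document's extractor output already calls names.
        name_tokens = {tokens[i].lower() for i in positions}
        expanded.append({i for i, t in enumerate(tokens) if t.lower() in name_tokens})
    return expanded


def mdr(docs: List[Doc], predicted: List[Set[int]]) -> List[Set[int]]:
    """Multiple document repetition: spread name predictions across all documents."""
    # Name tokens pooled over the whole collection.
    name_tokens = {
        tokens[i].lower()
        for tokens, positions in zip(docs, predicted)
        for i in positions
    }
    return [
        {i for i, t in enumerate(tokens) if t.lower() in name_tokens}
        for tokens in docs
    ]


# Toy corpus: the extractor found "Einat" only once, in the first email.
docs = [
    ["Einat", "will", "call", "and", "einat", "will", "bring", "slides"],
    ["ask", "einat", "about", "the", "meeting"],
]
predicted = [{0}, set()]

sdr_result = sdr(docs, predicted)
mdr_result = mdr(docs, predicted)
print(sdr_result)
print(mdr_result)
```

In this toy case SDR recovers the second mention in the first email only, while MDR also recovers the mention in the second email, which matches the observation that email names tend to repeat across documents. Without filtering, a name token that is also a common word would be over-marked.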
Related papers
- Search-based Structured Prediction: This is the journal version of the paper that introduces the SEARN algorithm - Daume_et_al,_ML_2009.