Difference between revisions of "User talk:Xxiong"
From Cohen Courses
Revision as of 14:55, 8 October 2010
Citation
Einat Minkov, Richard C. Wang & William W. Cohen, Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text, in HLT/EMNLP 2005
Online version
Extracting Personal Names from Emails
Summary
Task: extract person names from emails
Techniques: NER is treated as a sequence tagging problem, and a CRF model is used for the task.
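To illustrate what treating NER as tagging means, here is a minimal sketch that turns name spans into one label per token. The B-PER/I-PER/O label set and the helper name `spans_to_tags` are assumptions made for illustration; the summary does not say which tag scheme the paper uses, and the CRF itself is not shown.

```python
def spans_to_tags(tokens, name_spans):
    """Return one tag per token, given (start, end) name spans with end exclusive.

    Hypothetical helper: the tag names are an assumed BIO-style scheme.
    """
    tags = ["O"] * len(tokens)
    for start, end in name_spans:
        tags[start] = "B-PER"            # first token of a name
        for i in range(start + 1, end):
            tags[i] = "I-PER"            # remaining tokens of the same name
    return tags


tokens = ["Please", "forward", "this", "to", "Richard", "Wang", "today"]
tags = spans_to_tags(tokens, [(4, 6)])
for token, tag in zip(tokens, tags):
    print(token, tag)
```

A sequence model such as a CRF would then be trained to predict the tag sequence from token features.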
Contribution:
- An email-specific feature set.
- The authors found that name repetitions within a single document are more common in newswire text, while repetitions across multiple documents are more common in emails. The recall-enhancing techniques are based on this observation.
Recall-enhancing Techniques:
- single document repetition (SDR): mark repeated tokens within a single document as a name.
- multiple document repetition (MDR): mark repeated tokens appearing in multiple documents as a name.
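The two repetition heuristics can be sketched roughly as follows. This is one possible reading, not the paper's procedure: it assumes an initial tagger has already marked some token positions as names, and that a "repeated token" is any other occurrence (case-insensitive) of a token marked as a name. The function names `sdr` and `mdr` and the input layout are hypothetical.

```python
def sdr(docs, predicted):
    """Single document repetition: propagate name marks within each document.

    docs: list of token lists; predicted: list of sets of token indices
    that an initial tagger (assumed, not shown) marked as names.
    """
    result = []
    for tokens, idxs in zip(docs, predicted):
        names = {tokens[i].lower() for i in idxs}
        result.append({i for i, t in enumerate(tokens) if t.lower() in names})
    return result


def mdr(docs, predicted):
    """Multiple document repetition: propagate name marks across all documents."""
    names = {tokens[i].lower()
             for tokens, idxs in zip(docs, predicted)
             for i in idxs}
    return [{i for i, t in enumerate(tokens) if t.lower() in names}
            for tokens in docs]


docs = [
    ["Einat", "called", ".", "Then", "einat", "left"],
    ["Ask", "Einat", "about", "it"],
]
predicted = [{0}, set()]   # the tagger found only the first mention

sdr_marks = sdr(docs, predicted)
mdr_marks = mdr(docs, predicted)
print(sdr_marks)
print(mdr_marks)
```

In this toy input, SDR recovers the second mention in the first message only, while MDR also marks the mention in the second message, which is why cross-document repetition helps recall on emails.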
Related papers
- Search-based Structured Prediction: This is the journal version of the paper that introduces the SEARN algorithm (Daume et al., ML 2009).