Difference between revisions of "User talk:Xxiong"

From Cohen Courses

Revision as of 15:55, 8 October 2010

Citation

Einat Minkov, Richard C. Wang & William W. Cohen, Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text, in HLT/EMNLP 2005

Online version

Extracting Personal Names from Emails

Summary

Task: extract person names from emails

Techniques: NER is treated as a sequence tagging problem, and a CRF (conditional random field) model is used for this task.
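To make "NER as tagging" concrete, here is a minimal sketch of how name spans can be turned into one tag per token, which is the kind of input a sequence tagger such as a CRF is trained on. The BIO scheme, the `B-PER`/`I-PER`/`O` tag names, and the helper `spans_to_bio` are assumptions for illustration; the summary does not say which tag set the paper uses.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Convert name spans (start inclusive, end exclusive) to per-token tags.

    Assumed tag set: B-PER starts a name, I-PER continues it, O is everything else.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B-PER"
        for i in range(start + 1, end):
            tags[i] = "I-PER"
    return tags


tokens = ["Please", "forward", "this", "to", "William", "Cohen", "today"]
tags = spans_to_bio(tokens, [(4, 6)])
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A tagger then predicts one such tag per token from token-level features, and contiguous `B-PER`/`I-PER` runs are read back as extracted names.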

Contribution:

  • email-specific feature set.
  • The authors found that name repetitions within a single document are more common in newswire text, while repetitions across multiple documents are more common in emails.

Recall-enhancing techniques, motivated by this finding:

  • single document repetition (SDR): mark repeated tokens within a single document as a name.
  • multiple document repetition (MDR): mark repeated tokens appearing in multiple documents as a name.
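The two techniques above can be sketched as a post-processing step over a tagger's output. This is an illustration under stated assumptions, not the authors' procedure: it assumes that a token tagged as a name at least once is then marked as a name at every other occurrence, within the same document for SDR or across the whole collection for MDR. The function names `sdr` and `mdr`, the boolean label format, and the case-insensitive matching are all hypothetical choices.

```python
from typing import List

Doc = List[str]      # a tokenized document
Labels = List[bool]  # True where the tagger marked the token as a name


def sdr(docs: List[Doc], labels: List[Labels]) -> List[Labels]:
    """Single document repetition: spread name labels within each document."""
    result = []
    for tokens, tags in zip(docs, labels):
        # Tokens the tagger marked as a name somewhere in this document.
        names = {tok.lower() for tok, is_name in zip(tokens, tags) if is_name}
        result.append([tok.lower() in names for tok in tokens])
    return result


def mdr(docs: List[Doc], labels: List[Labels]) -> List[Labels]:
    """Multiple document repetition: spread name labels across all documents."""
    # Tokens the tagger marked as a name anywhere in the collection.
    names = {
        tok.lower()
        for tokens, tags in zip(docs, labels)
        for tok, is_name in zip(tokens, tags)
        if is_name
    }
    return [[tok.lower() in names for tok in tokens] for tokens in docs]


docs = [["Einat", "wrote", ";", "einat", "agrees"], ["ask", "einat", "or", "Richard"]]
labels = [[True, False, False, False, False], [False, False, False, True]]
print(sdr(docs, labels))
print(mdr(docs, labels))
```

In this toy input, SDR recovers the second "einat" in the first message only, while MDR also recovers "einat" in the second message, where the tagger never marked it. That matches the observation that repetition across documents is the more useful signal for email.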

Related papers

  • Search-based Structured Prediction: This is the journal version of the paper that introduces the SEARN algorithm - Daume_et_al,_ML_2009.