Difference between revisions of "User talk:Xxiong"

@@ Line 1: / Line 1: @@
-== Citation ==
-Einat Minkov, Richard C. Wang & William W. Cohen, Extracting Personal Names from Emails:
-Applying Named Entity Recognition to Informal Text, in HLT/EMNLP 2005
-== Online version ==
-[http://www.cs.cmu.edu/~einat/email.pdf Extracting Personal Names from Emails]
-== Summary ==
-Task: extract person names from emails
-Techniques: NER is treated as a sequence tagging problem, and a CRF model is used for the task.
-Contribution:
-* An email-specific feature set.
-* The authors found that repetitions of a name within a single document are more common in newswire text, while repetitions across multiple documents are more common in emails.
-Based on this finding, the authors introduced a new recall-enhancing method that is suited to emails.
-Recall-enhancing Techniques:
-* single document repetition (SDR): mark repeated tokens within a single document as a name.
-* multiple document repetition (MDR): mark repeated tokens appearing in multiple documents as a name.
-* inferred dictionaries: build a dictionary from the preliminary names predicted by an extractor learned from the training data.
-Then filter the dictionary based on predicted frequency (PF) and inverse document frequency (IDF).
-Words with low PF.IDF scores are either highly ambiguous in the corpus or common words that were inaccurately predicted as names by the extractor.
-* PF: measures the ratio between the number of times a word is predicted as part of a name and the total number of occurrences of this word.
-* IDF: inverse document frequency; it is lower for words that occur in many documents.
-== Related papers ==
-* '''Search-based Structured Prediction''': This is the journal version of the paper that introduces the [[UsesMethod::SEARN]] algorithm - [[RelatedPaper::Daume_et_al,_ML_2009]].
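
The PF.IDF filtering step described in the summary can be illustrated with a minimal Python sketch. The summary gives only the definition of PF, so the IDF normalization, the lowercasing of tokens, the function names <code>pf_idf_scores</code> and <code>build_dictionary</code>, and the threshold value are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def pf_idf_scores(docs, predicted):
    """Score every word that the extractor predicted as a name at least once.

    docs:      list of documents, each a list of tokens.
    predicted: list (parallel to docs) of sets of token positions that a
               preliminary extractor tagged as part of a name.
    """
    n_docs = len(docs)
    total = Counter()     # occurrences of each word in the corpus
    as_name = Counter()   # occurrences predicted as part of a name
    doc_freq = Counter()  # number of documents containing each word

    for tokens, name_positions in zip(docs, predicted):
        for i, tok in enumerate(tokens):
            word = tok.lower()  # assumption: case-insensitive matching
            total[word] += 1
            if i in name_positions:
                as_name[word] += 1
        doc_freq.update({tok.lower() for tok in tokens})

    scores = {}
    for word, count in as_name.items():
        pf = count / total[word]
        # Assumed smoothed, normalized IDF; the source does not give a formula.
        idf = math.log((n_docs + 0.5) / doc_freq[word]) / math.log(n_docs + 1)
        scores[word] = pf * idf
    return scores


def build_dictionary(scores, threshold=0.5):
    """Keep only words whose PF.IDF score reaches a (hypothetical) threshold."""
    return {word for word, score in scores.items() if score >= threshold}


docs = [
    ["thanks", "einat", "see", "you"],
    ["will", "you", "ask", "will"],
    ["you", "will", "see"],
]
predicted = [{1}, {0}, set()]  # toy extractor output, invented for this example
scores = pf_idf_scores(docs, predicted)
name_dictionary = build_dictionary(scores)
```

In this toy corpus "will" is predicted as a name only once out of three occurrences and appears in two of three documents, so it scores lower than "einat" and is filtered out of the dictionary.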

Latest revision as of 16:53, 8 October 2010
