| − | == Citation ==
| |
| | | | |
| − | Einat Minkov, Richard C. Wang & William W. Cohen, Extracting Personal Names from Emails:
| |
| − | Applying Named Entity Recognition to Informal Text, in HLT/EMNLP 2005
| |
| − |
| |
| − | == Online version ==
| |
| − |
| |
| − | [http://www.cs.cmu.edu/~einat/email.pdf Extracting Personal Names from Emails]
| |
| − |
| |
| − | == Summary ==
| |
| − | Task: named entity recognition (NER) of personal names in emails
| |
| − |
| |
| − | Techniques: NER is treated as a sequence tagging problem, and a CRF model is used for the task.
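To make "NER as tagging" concrete, here is a minimal sketch that turns annotated name spans into one tag per token, which is the kind of label sequence a CRF tagger is trained to predict. The BIO scheme, the `PER` label, and the helper name `spans_to_bio` are assumptions for illustration; this page does not say which tag set the paper uses.

```python
def spans_to_bio(tokens, spans, label="PER"):
    """Convert (start, end) token spans (end exclusive) into BIO tags.

    Hypothetical helper: the BIO encoding is an assumed tag scheme.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"          # first token of the name
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the name
    return tags


tokens = ["Please", "forward", "this", "to", "William", "Cohen", "today"]
tags = spans_to_bio(tokens, [(4, 6)])
```

Each token now carries exactly one label, so the task has the same shape as other sequence labeling problems.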
| |
| − |
| |
| − | Contribution:
| |
| − | * email-specific feature set
| |
| − |
| |
| − | Name repetitions within a single document are more common in newswire, while repetitions across multiple documents are more common in emails.
| |
| − |
| |
| − | == Example SEARN Usage ==
| |
| − |
| |
| − | '''Sequence Labeling'''
| |
| − | * Discussed SEARN's application to [[AddressesProblem::POS tagging]] and [[AddressesProblem::NP chunking]]
| |
| − |
| |
| − | ''Tagging''
| |
| − | * Task is to produce a label sequence from an input sequence.
| |
| − | * Search framed as left-to-right greedy search.
| |
| − | * ''Loss function'': Hamming loss
| |
| − | * Optimal Policy:
| |
| − | [[File:op-tagging.png]]
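The tagging setup above can be sketched in a few lines. The formula in the image is not reproduced here; the sketch assumes the usual observation that, under Hamming loss, the best next action is the gold label for the next position regardless of earlier mistakes. All function names are hypothetical.

```python
def hamming_loss(predicted, gold):
    """Number of positions where the predicted label differs from the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("sequences must have the same length")
    return sum(p != g for p, g in zip(predicted, gold))


def optimal_policy_tagging(prefix, gold):
    """Assumed optimal policy for Hamming loss: emit the gold label of the next position."""
    return gold[len(prefix)]


def greedy_decode(length, policy):
    """Left-to-right greedy search: extend the label prefix one position at a time."""
    prefix = []
    while len(prefix) < length:
        prefix.append(policy(prefix))
    return prefix


gold = ["DT", "NN", "VB"]
decoded = greedy_decode(len(gold), lambda prefix: optimal_policy_tagging(prefix, gold))
```

Because Hamming loss decomposes over positions, an earlier wrong label does not change which label is best at later positions.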
| |
| − |
| |
| − |
| |
| − | ''NP Chunking''
| |
| − | * Chunking is a joint segmentation and labeling problem.
| |
| − | * ''Loss function'': F1 measure
| |
| − | * Optimal Policy:
| |
| − | [[File:op-chunking.png]]
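As a rough illustration of why chunking is scored on whole segments rather than single labels, the sketch below computes chunk-level F1 from two tag sequences. It assumes BIO-style tags and assumes the loss is taken as 1 - F1 so that lower is better; neither detail is stated on this page, and the function names are hypothetical.

```python
def extract_chunks(tags):
    """Return the set of (start, end, label) chunks in a BIO tag sequence (end exclusive)."""
    chunks = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing chunk
        continues = tag.startswith("I-") and start is not None and tag[2:] == label
        if start is not None and not continues:
            chunks.add((start, i, label))
            start, label = None, None
        # A stray I- tag with no open chunk is treated leniently as a chunk start.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return chunks


def chunk_f1(predicted, gold):
    """F1 over exactly matching chunks (same boundaries and same label)."""
    pred_chunks, gold_chunks = extract_chunks(predicted), extract_chunks(gold)
    correct = len(pred_chunks & gold_chunks)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_chunks)
    recall = correct / len(gold_chunks)
    return 2 * precision * recall / (precision + recall)


gold_tags = ["B-NP", "I-NP", "O", "B-NP"]
pred_tags = ["B-NP", "I-NP", "O", "O"]
loss = 1.0 - chunk_f1(pred_tags, gold_tags)
```

Unlike Hamming loss, this score does not decompose over positions: a single wrong tag can destroy an entire chunk.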
| |
| − |
| |
| − | '''Parsing'''
| |
| − | * Looked at dependency parsing with a shift-reduce framework.
| |
| − | * ''Loss function'': Hamming loss over dependencies.
| |
| − | * ''Decisions'': shift/reduce
| |
| − | * ''Optimal Policy'':
| |
| − | [[File:op-parsing.png]]
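A small sketch of how a sequence of shift/reduce decisions builds a dependency tree, and how Hamming loss over dependencies can be computed from the result. The page only names the decisions as shift/reduce; the split into `reduce-left` and `reduce-right` (in the style of an arc-standard parser) and all function names are assumptions made for illustration.

```python
def run_transitions(n_words, actions):
    """Apply shift/reduce actions and return the head index of each word (None = no head)."""
    heads = [None] * n_words
    stack, next_word = [], 0
    for action in actions:
        if action == "shift":
            if next_word >= n_words:
                raise ValueError("no words left to shift")
            stack.append(next_word)
            next_word += 1
        elif action in ("reduce-left", "reduce-right"):
            if len(stack) < 2:
                raise ValueError("reduce needs two items on the stack")
            # reduce-left: the top of the stack becomes head of the item below it.
            # reduce-right: the item below becomes head of the top of the stack.
            dependent = stack.pop(-2) if action == "reduce-left" else stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown action: {action}")
    return heads


def dependency_hamming_loss(predicted_heads, gold_heads):
    """Number of words whose predicted head differs from the gold head."""
    return sum(p != g for p, g in zip(predicted_heads, gold_heads))


# "She eats fish": "eats" (index 1) heads both other words.
heads = run_transitions(3, ["shift", "shift", "reduce-left", "shift", "reduce-right"])
```

Here each decision changes the stack, and the loss is read off the finished head assignment.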
| |
| − |
| |
| − | '''Machine Translation'''
| |
| − | * Framed task as a left-to-right translation problem.
| |
| − | * Search space over prefixes of translations.
| |
| − | * Actions add a word (or phrase) to the end of the existing translation.
| |
| − | * ''Loss function'': 1 - BLEU or 1 - NIST
| |
| − | * ''Optimal policy'': given a set of reference translations R and an English translation prefix e_1, ..., e_{i-1}, decide which word (or phrase) should be produced next, or whether the translation is finished.
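The left-to-right search space and a 1 - BLEU style loss can be sketched as follows. This is a deliberately simplified stand-in, not full BLEU: it uses only clipped unigram precision with a brevity penalty, and the `STOP` marker and function names are hypothetical.

```python
import math
from collections import Counter

STOP = "</s>"  # hypothetical action meaning "the translation is finished"


def apply_action(prefix, action):
    """Extend a translation prefix by one word; STOP leaves the prefix unchanged."""
    return prefix if action == STOP else prefix + (action,)


def bleu1(candidate, references):
    """Simplified BLEU: clipped unigram precision times the brevity penalty."""
    if not candidate:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for word, count in Counter(ref).items():
            max_ref[word] = max(max_ref[word], count)
    clipped = sum(min(c, max_ref[w]) for w, c in Counter(candidate).items())
    precision = clipped / len(candidate)
    # Use the reference length closest to the candidate length (shorter on ties).
    ref_len = min((len(r) for r in references),
                  key=lambda n: (abs(n - len(candidate)), n))
    penalty = 1.0 if len(candidate) > ref_len else math.exp(1 - ref_len / len(candidate))
    return penalty * precision


references = [("the", "cat", "sat")]
state = ()
for action in ["the", "cat", "sat", STOP]:
    state = apply_action(state, action)
loss = 1.0 - bleu1(state, references)
```

Each state is a prefix of the output, so the loss is only meaningful once the finishing action has been taken.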
| |
| − |
| |
| − | == Related papers ==
| |
| − |
| |
| − | * '''Search-based Structured Prediction''': This is the journal version of the paper that introduces the [[UsesMethod::SEARN]] algorithm - [[RelatedPaper::Daume_et_al,_ML_2009]].
| |