Difference between revisions of "Daume et al, 2006"

From Cohen Courses
Jump to navigationJump to search
 
(7 intermediate revisions by the same user not shown)
Line 12: Line 12:
  
 
The key points of the paper are:
 
The key points of the paper are:
* Authors state that SEARN is efficient, widely applicable, theoretically justified, and simple.  
+
* Authors state that [[UsesMethod::SEARN]] is efficient, widely applicable, theoretically justified, and simple.  
* SEARN looks at problems as a search problems, and learn classifiers that walk through the search space in a good way.
+
* [[UsesMethod::SEARN]] looks at problems a search problems, and learns classifiers that walk through the search space in a good way.
 
* Authors looked at 3 sample problems: [[AddressesProblem::Sequence Labeling]], [[AddressesProblem::Parsing]], and [[AddressesProblem::Machine Translation]]
 
* Authors looked at 3 sample problems: [[AddressesProblem::Sequence Labeling]], [[AddressesProblem::Parsing]], and [[AddressesProblem::Machine Translation]]
* Efficacy of SEARN hinges on ability to compute an optimal/near-optimal policy. When optimal policy not available, authors suggest performing explicit search as an approximation. For segmentaiton and parsing, optimal policy is closed form; for summarization and machine translation, the optimal policy is not available.
+
* Efficacy of [[UsesMethod::SEARN]] hinges on ability to compute an optimal/near-optimal policy. When an optimal policy is not available, authors suggest performing explicit search as an approximation. For segmentaiton and parsing, optimal policy is closed form; for summarization and machine translation, the optimal policy is not available.
  
 
== Example SEARN Usage ==
 
== Example SEARN Usage ==
  
 
'''Sequence Labeling'''
 
'''Sequence Labeling'''
* Discussed POS-tagging and NP chunking
+
* Discussed SEARN's application to [[AddressesProblem::POS tagging]] and [[AddressesProblem::NP chunking]]
  
 
''Tagging''
 
''Tagging''
* Produce label sequence from input sequence. Loss function is Hamming loss, search as left-to-right greedy search.  
+
* Task is to produce a label sequence from an input sequence.  
 +
* Search framed as left-to-right greedy search.
 +
* ''Loss function'': Hamming loss
 
* Optimal Policy:
 
* Optimal Policy:
 
[[File:op-tagging.png]]
 
[[File:op-tagging.png]]
  
  
* ''Chunking'': Joint segmentation and labeling problem. Loss function is F1 measure
+
''NP Chunking''
 +
* Chunking is a joint segmentation and labeling problem.  
 +
* ''Loss function'': F1 measure
 
* Optimal Policy:
 
* Optimal Policy:
 
[[File:op-chunking.png]]
 
[[File:op-chunking.png]]
  
 
'''Parsing'''
 
'''Parsing'''
* Looked at dependency parsing with shift-reduce framework
+
* Looked at dependency parsing with a shift-reduce framework.
* ''Loss funtion'': Hamming loss over dependencies
+
* ''Loss funtion'': Hamming loss over dependencies.
 
* ''Decisions'': shift/reduce
 
* ''Decisions'': shift/reduce
 
* ''Optimal Policy'':
 
* ''Optimal Policy'':
Line 40: Line 44:
  
 
'''Machine Translation'''
 
'''Machine Translation'''
* Frame as left-to-right translation problem. Search space over prefixes of translations, actions are adding a word (or phrase to end of existing translation)
+
* Framed task as a left-to-right translation problem.  
 +
* Search space over prefixes of translations.
 +
* Actions are adding a word (or phrase to end of existing translation.
 
* ''Loss function'': 1 - BLEU or 1 - NIST
 
* ''Loss function'': 1 - BLEU or 1 - NIST
 
* ''Optimal policy'': given set of reference translations R, English translation prefix e_1, ... e_i-1, what word (or phrase) should be produced next / are we finished.
 
* ''Optimal policy'': given set of reference translations R, English translation prefix e_1, ... e_i-1, what word (or phrase) should be produced next / are we finished.
Line 46: Line 52:
 
== Related papers ==
 
== Related papers ==
  
* '''Search-based Structured Prediction''': This is the journal version of the paper that introduces the SEARN algorithm - [[RelatedPaper::Daume_et_al,_ML_2009]].
+
* '''Search-based Structured Prediction''': This is the journal version of the paper that introduces the [[UsesMethod::SEARN]] algorithm - [[RelatedPaper::Daume_et_al,_ML_2009]].

Latest revision as of 17:21, 30 September 2010

Citation

Duame, H., Langford, J., and Marcu, D. 2006. SEARN in Practice. Unpublished Manuscript

Online version

SEARN in Practice

Summary

This unpublished manuscript describes how SEARN can be used for three Natural Language Processing related tasks: Sequence Labeling, Parsing, and Machine Translation

The key points of the paper are:

  • Authors state that SEARN is efficient, widely applicable, theoretically justified, and simple.
  • SEARN looks at problems a search problems, and learns classifiers that walk through the search space in a good way.
  • Authors looked at 3 sample problems: Sequence Labeling, Parsing, and Machine Translation
  • Efficacy of SEARN hinges on ability to compute an optimal/near-optimal policy. When an optimal policy is not available, authors suggest performing explicit search as an approximation. For segmentaiton and parsing, optimal policy is closed form; for summarization and machine translation, the optimal policy is not available.

Example SEARN Usage

Sequence Labeling

Tagging

  • Task is to produce a label sequence from an input sequence.
  • Search framed as left-to-right greedy search.
  • Loss function: Hamming loss
  • Optimal Policy:

Op-tagging.png


NP Chunking

  • Chunking is a joint segmentation and labeling problem.
  • Loss function: F1 measure
  • Optimal Policy:

Op-chunking.png

Parsing

  • Looked at dependency parsing with a shift-reduce framework.
  • Loss funtion: Hamming loss over dependencies.
  • Decisions: shift/reduce
  • Optimal Policy:

Op-parsing.png

Machine Translation

  • Framed task as a left-to-right translation problem.
  • Search space over prefixes of translations.
  • Actions are adding a word (or phrase to end of existing translation.
  • Loss function: 1 - BLEU or 1 - NIST
  • Optimal policy: given set of reference translations R, English translation prefix e_1, ... e_i-1, what word (or phrase) should be produced next / are we finished.

Related papers

  • Search-based Structured Prediction: This is the journal version of the paper that introduces the SEARN algorithm - Daume_et_al,_ML_2009.