Difference between revisions of "Brill, CL 1995"

Revision as of 18:25, 31 October 2010

Citation

Brill, E. 1995. Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Computational Linguistics 21(4), pp. 543-565.

Online version

Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging

Summary

This paper introduces a learning technique called "Transformation-based error-driven learning", a.k.a. Transformation Based Learning (TBL).

The key points from the paper are:

  • Transformation Based Learning is an algorithm that learns an ordered sequence of transformations to correct the output of a baseline tagger
  • Transformations are broken down into two components: a triggering event (such as if the previous word is a determiner) and a re-write rule (such as change tag from modal to noun)
  • The author describes the use of Transformation Based Learning on POS tagging.
  • Some advantages of Transformation Based Learning:
    • Simple conceptually
    • Can be adapted to different learning problems
    • Rich triggers/rules that can make use of specific information and context
    • Seemingly resistant to over-fitting
      • Empirical result, not entirely understood
      • Always learns on the whole data set
  • Some considerations:
    • Constructing all possible transformations: should rules be created manually or generated from templates? The potentially huge search space can be problematic, and linguistic intuition may be needed to limit it
    • No probabilities/confidence associated with results
    • Transformations in one environment could affect application in another: should transformations be applied immediately, or only after the entire corpus has been examined for triggering conditions? In what order should the corpus be processed (left-to-right or right-to-left)?
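
To make the last consideration concrete, here is a minimal Python sketch of a transformation as a triggering event plus a re-write rule, applied in two ways. The `Transformation` class, its field names, and the two `apply_*` functions are hypothetical names chosen for illustration, and the only trigger modelled is "the previous tag is X"; this is not code from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transformation:
    """Hypothetical rule: if the previous tag is prev_tag, rewrite from_tag as to_tag."""
    prev_tag: str  # triggering event
    from_tag: str  # re-write rule: tag to change
    to_tag: str    # re-write rule: replacement tag


def apply_delayed(rule: Transformation, tags: List[str]) -> List[str]:
    # Triggers are tested against the tags as they were before the rule ran,
    # so a change at one position cannot affect the trigger at another.
    old = list(tags)
    return [
        rule.to_tag
        if i > 0 and old[i - 1] == rule.prev_tag and old[i] == rule.from_tag
        else old[i]
        for i in range(len(old))
    ]


def apply_immediate(rule: Transformation, tags: List[str]) -> List[str]:
    # Left-to-right pass; a change at position i-1 is already visible
    # when the trigger is tested at position i.
    new = list(tags)
    for i in range(1, len(new)):
        if new[i - 1] == rule.prev_tag and new[i] == rule.from_tag:
            new[i] = rule.to_tag
    return new


# The example from the summary: previous tag is a determiner, change modal to noun.
det_rule = Transformation(prev_tag="DT", from_tag="MD", to_tag="NN")
print(apply_delayed(det_rule, ["DT", "MD", "VBD"]))  # ['DT', 'NN', 'VBD']

# A toy rule whose output feeds its own trigger shows the two modes diverging.
toy = Transformation(prev_tag="A", from_tag="A", to_tag="B")
print(apply_delayed(toy, ["A", "A", "A", "A"]))    # ['A', 'B', 'B', 'B']
print(apply_immediate(toy, ["A", "A", "A", "A"]))  # ['A', 'B', 'A', 'B']
```

A right-to-left immediate pass would give yet another result on the toy input, which is why the processing order has to be fixed when the rules are learned and kept the same when they are applied.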

Transformation-Based Learning

The learning algorithm is summarized as follows (see Figure 1 from paper as well):

  • Pass un-annotated corpus (training data) through initial-state annotator
  • Compare against truth to get current score (based on number of classification errors)
  • Loop until no transformation can be found that improves the score (greedy search)
    • Consider all transformation rules applied to the training data and select the best one
    • Apply the transformation to the data and update the current score
    • Add transformation to ordered transformation list
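
The loop above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the corpus is one flat tag sequence, the only rule template is "change tag X to Y if the previous tag is Z", candidates are generated only at positions that are currently wrong, and rules are applied in delayed mode. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, str, str]  # (prev_tag, from_tag, to_tag); assumed single template


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    # Delayed application: triggers are tested on the tags before the rule ran.
    prev_tag, from_tag, to_tag = rule
    old = list(tags)
    return [
        to_tag if i > 0 and old[i - 1] == prev_tag and old[i] == from_tag else old[i]
        for i in range(len(old))
    ]


def count_errors(tags: List[str], truth: List[str]) -> int:
    # Score = number of positions where the current tag disagrees with the truth.
    return sum(t != g for t, g in zip(tags, truth))


def candidate_rules(tags: List[str], truth: List[str]) -> Set[Rule]:
    # Instantiate the template at every position that is currently mis-tagged.
    return {
        (tags[i - 1], tags[i], truth[i])
        for i in range(1, len(tags))
        if tags[i] != truth[i]
    }


def learn(initial_tags: List[str], truth: List[str]) -> Tuple[List[Rule], List[str]]:
    tags = list(initial_tags)
    errors = count_errors(tags, truth)
    learned: List[Rule] = []
    while True:
        best_rule, best_tags, best_errors = None, tags, errors
        for rule in sorted(candidate_rules(tags, truth)):
            new_tags = apply_rule(rule, tags)
            new_errors = count_errors(new_tags, truth)
            if new_errors < best_errors:  # keep only strict improvements
                best_rule, best_tags, best_errors = rule, new_tags, new_errors
        if best_rule is None:  # no transformation improves the score: stop
            break
        tags, errors = best_tags, best_errors
        learned.append(best_rule)  # the order of the list is part of the model
    return learned, tags


# Toy training data: the initial-state annotator tagged every "can" as a modal.
truth = ["DT", "NN", "VBD", ".", "PRP", "MD", "VB", "."]
initial = ["DT", "MD", "VBD", ".", "PRP", "MD", "VB", "."]
rules, final_tags = learn(initial, truth)
print(rules)  # [('DT', 'MD', 'NN')]
```

Because each accepted rule must strictly reduce the error count, the loop terminates after at most as many iterations as there are initial errors.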

[Figure 1 from the paper: Brill95 fig1.png]


Transformations are applied as follows:

  • Run initial-state annotator on unseen data
  • Loop through the ordered list of transformations, applying each one in turn.
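
A minimal Python sketch of these two steps is shown below. It assumes an initial-state annotator that looks each word up in a small word-to-tag lexicon and falls back to a default tag, and a single rule template conditioned on the previous tag. The lexicon, the default tag, and the function names are illustrative assumptions rather than details taken from the paper.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str, str]  # (prev_tag, from_tag, to_tag); assumed single template


def initial_state_annotator(
    words: List[str], lexicon: Dict[str, str], default: str = "NN"
) -> List[str]:
    # Assumed baseline: look each word up in a lexicon, else use a default tag.
    return [lexicon.get(w.lower(), default) for w in words]


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    # Delayed application: triggers are tested on the tags before the rule ran.
    prev_tag, from_tag, to_tag = rule
    old = list(tags)
    return [
        to_tag if i > 0 and old[i - 1] == prev_tag and old[i] == from_tag else old[i]
        for i in range(len(old))
    ]


def tag(words: List[str], lexicon: Dict[str, str], rules: List[Rule]) -> List[str]:
    tags = initial_state_annotator(words, lexicon)
    for rule in rules:  # rules run in the order in which they were learned
        tags = apply_rule(rule, tags)
    return tags


lexicon = {"the": "DT", "can": "MD", "rusted": "VBD"}
learned_rules = [("DT", "MD", "NN")]
print(tag(["The", "can", "rusted"], lexicon, learned_rules))  # ['DT', 'NN', 'VBD']
```

The rules must be applied in the order in which they were learned, since each one was scored against the output of the rules before it.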


Transformation-based Learning for POS Tagging

Related papers

  • SEARN in Practice: This unpublished manuscript showcases three example problems where SEARN can be used - Daume_et_al,_2006.