Difference between revisions of "Brill, CL 1995"
PastStudents (talk | contribs) |
PastStudents (talk | contribs) |
||
(7 intermediate revisions by the same user not shown) | |||
Line 11: | Line 11: | ||
This [[Category::paper]] introduces a learning technique called "Transformation-based error-driven learning", a.k.a. [[UsesMethod::Transformation Based Learning]] (TBL). | This [[Category::paper]] introduces a learning technique called "Transformation-based error-driven learning", a.k.a. [[UsesMethod::Transformation Based Learning]] (TBL). | ||
− | + | Transformation Based Learning is an algorithm that learns a sequence of transformations to improve tagging on some baseline tagger. Transformations are broken down into two components: a ''triggering event'' (such as if the previous word is a determiner) and a ''re-write rule'' (such as change tag from modal to noun). Some advantages of Transformation based learning include the following: simple conceptually, TBL can be adapted to different learning problems, rich triggers/rules can make use of specific information and context, seemingly resistant to over-fitting(observed empirically, not entirely understood). When using Transformation based learning, a number of things should be considered: when constructing all possible transformations should we manually create rules or make templates?; the potentially huge search space can be problematic so you may need to use linguistic intuition to limit space; there are no probabilities/confidence associated with results; transformations in one environment could affect application in another: should transformations be applied immediately or only after entire corpus is examined for triggering conditions? what order do we process corpus? (left-to-right or right-to-left). | |
− | |||
− | |||
− | |||
+ | The authors described the use of Transformation Based Learning on [[AddressesProblem::POS Tagging]]. When compared to a markov-model based POS tagger, the TBL Tagger was able to achieve comparable tagging accuracy with a number of rules which is much smaller than the number of context probabilities calculated for stocastic tagger, and can do so with a much smaller sized training corpus. Also initial rules contribute most to tagging accuracy (say first 100 or 200), and rest improve performance marginally. | ||
== Transformation-Based Learning == | == Transformation-Based Learning == | ||
− | '''The learning algorithm is summarized as follows''': | + | '''The learning algorithm is summarized as follows''' (see Figure 1 from paper as well): |
− | * | + | * Pass un-annotated corpus (training data) through initial-state annotator |
+ | * Compare against truth to get current score (based on number of classification errors) | ||
+ | * Loop until no transformation can be found to improve score (Greedy search) | ||
+ | ** Consider all transformations rules applied to training data, select best | ||
+ | ** Apply to transformation to data & get current score | ||
+ | ** Add transformation to ordered transformation list | ||
[[File:brill95_fig1.png]] | [[File:brill95_fig1.png]] | ||
+ | |||
'''Transformations are applied as follows''': | '''Transformations are applied as follows''': | ||
Line 29: | Line 33: | ||
− | == | + | == Example usage: POS Tagging == |
− | + | * Initial-state: | |
+ | ** Each word assigned to most likely POS tag based on training corpus | ||
+ | * Rules: | ||
+ | ** Non-lexical: Based mainly on tags of words located near the target position | ||
+ | ** Lexical: Based mainly on words in surrounding context | ||
== Related papers == | == Related papers == | ||
− | + | * '''Text Chunking using Transformation-Based Learning''': Transformation-based learning applied to Noun-Phrase Chunking - [[RelatedPaper::Ramshaw_&_Marcus,_1995]]. | |
− | * | + | * '''Tagging gene and protein names in biomedical text''': Transformation-based learning applied to gene & protein tagging, for system called ''Abgene'' - [[RelatedPaper::Tanabe_&_Wilbur,_2002]]. |
+ | * '''Sense Deduction: The Power of Peewees Applied to SENSEVAL-2 Sweedish Lexical Sample Task''': Transformation-based learning applied to word sense disambiguation - [[RelatedPaper::Lager_&_Zinovjeva,_2001]]. |
Latest revision as of 01:28, 23 November 2010
Contents
Citation
Brill, E. 1995. Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Computational Linguistics. 21. 4. p543-565
Online version
Summary
This paper introduces a learning technique called "Transformation-based error-driven learning", a.k.a. Transformation Based Learning (TBL).
Transformation Based Learning is an algorithm that learns a sequence of transformations to improve tagging on some baseline tagger. Transformations are broken down into two components: a triggering event (such as if the previous word is a determiner) and a re-write rule (such as change tag from modal to noun). Some advantages of Transformation based learning include the following: simple conceptually, TBL can be adapted to different learning problems, rich triggers/rules can make use of specific information and context, seemingly resistant to over-fitting(observed empirically, not entirely understood). When using Transformation based learning, a number of things should be considered: when constructing all possible transformations should we manually create rules or make templates?; the potentially huge search space can be problematic so you may need to use linguistic intuition to limit space; there are no probabilities/confidence associated with results; transformations in one environment could affect application in another: should transformations be applied immediately or only after entire corpus is examined for triggering conditions? what order do we process corpus? (left-to-right or right-to-left).
The authors described the use of Transformation Based Learning on POS Tagging. When compared to a markov-model based POS tagger, the TBL Tagger was able to achieve comparable tagging accuracy with a number of rules which is much smaller than the number of context probabilities calculated for stocastic tagger, and can do so with a much smaller sized training corpus. Also initial rules contribute most to tagging accuracy (say first 100 or 200), and rest improve performance marginally.
Transformation-Based Learning
The learning algorithm is summarized as follows (see Figure 1 from paper as well):
- Pass un-annotated corpus (training data) through initial-state annotator
- Compare against truth to get current score (based on number of classification errors)
- Loop until no transformation can be found to improve score (Greedy search)
- Consider all transformations rules applied to training data, select best
- Apply to transformation to data & get current score
- Add transformation to ordered transformation list
Transformations are applied as follows:
- Run initial-state annotator on unseen data
- Loop through ordered list of transformations, and apply each transformation.
Example usage: POS Tagging
- Initial-state:
- Each word assigned to most likely POS tag based on training corpus
- Rules:
- Non-lexical: Based mainly on tags of words located near the target position
- Lexical: Based mainly on words in surrounding context
Related papers
- Text Chunking using Transformation-Based Learning: Transformation-based learning applied to Noun-Phrase Chunking - Ramshaw_&_Marcus,_1995.
- Tagging gene and protein names in biomedical text: Transformation-based learning applied to gene & protein tagging, for system called Abgene - Tanabe_&_Wilbur,_2002.
- Sense Deduction: The Power of Peewees Applied to SENSEVAL-2 Sweedish Lexical Sample Task: Transformation-based learning applied to word sense disambiguation - Lager_&_Zinovjeva,_2001.