Brill, CL 1995
Brill, E. 1995. Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Computational Linguistics, 21(4), pp. 543-565.
Transformation-based learning (TBL) is an algorithm that learns an ordered sequence of transformations that correct the errors of a baseline tagger. Each transformation has two components: a triggering environment (for example, "the previous word is a determiner") and a rewrite rule (for example, "change the tag from modal to noun").
Advantages of TBL include the following:
- It is conceptually simple.
- It can be adapted to different learning problems.
- Rich triggers and rules can make use of specific information and context.
- It appears resistant to overfitting (observed empirically, not entirely understood).
When using TBL, a number of things should be considered:
- When constructing the set of possible transformations, should rules be written manually or generated from templates?
- The search space is potentially huge, so linguistic intuition may be needed to limit it.
- No probabilities or confidence values are associated with the results.
- A transformation applied in one environment can affect whether it applies in another: should a transformation be applied immediately, or only after the entire corpus has been examined for triggering conditions?
- In what order should the corpus be processed (left-to-right or right-to-left)?
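The split between a triggering environment and a rewrite rule can be made concrete with a small sketch. The class name `Transformation`, its fields, and the single "previous tag" trigger are assumptions made for illustration, not the paper's implementation; the tags follow Penn Treebank conventions (DT = determiner, MD = modal, NN = noun). This version checks triggers against the input tags and writes to a copy, which corresponds to the "apply only after examining the whole sequence" option mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transformation:
    """Hypothetical rule: change from_tag to to_tag when the previous tag is prev_tag."""
    from_tag: str   # rewrite rule: tag to be replaced
    to_tag: str     # rewrite rule: replacement tag
    prev_tag: str   # triggering environment: tag of the preceding word

    def apply(self, tags: List[str]) -> List[str]:
        # Triggers are evaluated on the unchanged input; changes go to a copy,
        # so one rewrite cannot enable or block another within the same pass.
        out = list(tags)
        for i in range(1, len(tags)):
            if tags[i] == self.from_tag and tags[i - 1] == self.prev_tag:
                out[i] = self.to_tag
        return out


# "change tag from modal to noun if the previous word is a determiner"
rule = Transformation(from_tag="MD", to_tag="NN", prev_tag="DT")
print(rule.apply(["DT", "MD", "VB"]))  # ['DT', 'NN', 'VB']
```

With immediate application, the rule "change A to B if the previous tag is A" would turn A A A A into A B A B; the delayed version sketched here yields A B B B.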
The author describes the application of transformation-based learning to POS tagging. Compared with a Markov-model-based POS tagger, the TBL tagger achieves comparable tagging accuracy with far fewer rules than the number of contextual probabilities estimated for the stochastic tagger, and it can do so with a much smaller training corpus. The initial rules (say, the first 100 or 200) contribute most of the tagging accuracy; the remaining rules improve performance only marginally.
The learning algorithm is summarized as follows (see Figure 1 from paper as well):
- Pass the unannotated corpus (training data) through the initial-state annotator.
- Compare the result against the truth to get the current score (based on the number of classification errors).
- Loop until no transformation can be found that improves the score (greedy search):
  - Consider all transformation rules applied to the training data and select the best one.
  - Apply that transformation to the data and update the current score.
  - Add the transformation to the ordered transformation list.
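The greedy loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the corpus is a single flat tag sequence, rules are hypothetical `(from_tag, to_tag, prev_tag)` tuples generated from one template, and the score is the raw count of tagging errors. The names `apply_rule`, `errors`, and `learn` are invented for this example.

```python
from typing import List, Tuple

Rule = Tuple[str, str, str]  # hypothetical encoding: (from_tag, to_tag, prev_tag)


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    """Change from_tag to to_tag wherever the previous tag equals prev_tag."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def errors(tags: List[str], truth: List[str]) -> int:
    """Score: number of positions where the current tag differs from the truth."""
    return sum(t != g for t, g in zip(tags, truth))


def learn(initial: List[str], truth: List[str]) -> List[Rule]:
    """Greedily collect rules until none reduces the error count further."""
    tagset = sorted(set(initial) | set(truth))
    # Instantiate the single template over every combination of tags.
    candidates = [(a, b, p) for a in tagset for b in tagset if a != b for p in tagset]
    current, learned = list(initial), []
    while True:
        best_rule, best_tags, best_err = None, None, errors(current, truth)
        for rule in candidates:
            new_tags = apply_rule(rule, current)
            err = errors(new_tags, truth)
            if err < best_err:  # strict improvement guarantees termination
                best_rule, best_tags, best_err = rule, new_tags, err
        if best_rule is None:
            return learned
        current = best_tags
        learned.append(best_rule)


initial = ["DT", "MD", "VB", "DT", "MD", "VB"]  # output of the initial-state annotator
truth = ["DT", "NN", "VB", "DT", "NN", "VB"]    # manually annotated tags
print(learn(initial, truth))  # [('MD', 'NN', 'DT')]
```

Enumerating every template instantiation on every iteration is what makes the search space a practical concern; the sketch makes no attempt to prune it.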
Transformations are applied as follows:
- Run initial-state annotator on unseen data
- Loop through ordered list of transformations, and apply each transformation.
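Applying a learned rule list to unseen text is a short loop, sketched below. The lookup-based annotator, the tiny `LEXICON`, the default tag `NN` for unknown words, and the `(from_tag, to_tag, prev_tag)` rule encoding are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Rule = Tuple[str, str, str]  # hypothetical encoding: (from_tag, to_tag, prev_tag)


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    """Change from_tag to to_tag wherever the previous tag equals prev_tag."""
    from_tag, to_tag, prev_tag = rule
    out = list(tags)
    for i in range(1, len(tags)):
        if tags[i] == from_tag and tags[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def tag_unseen(
    words: Sequence[str],
    initial_annotator: Callable[[Sequence[str]], List[str]],
    rules: Sequence[Rule],
) -> List[str]:
    """Run the initial-state annotator, then apply each rule in learned order."""
    tags = initial_annotator(words)
    for rule in rules:
        tags = apply_rule(rule, tags)
    return tags


# Hypothetical lexicon standing in for a trained initial-state annotator.
LEXICON = {"the": "DT", "can": "MD", "rusted": "VBD"}


def lookup_annotator(words: Sequence[str]) -> List[str]:
    # Unknown words default to NN here; this default is an assumption.
    return [LEXICON.get(w.lower(), "NN") for w in words]


rules = [("MD", "NN", "DT")]
print(tag_unseen(["The", "can", "rusted"], lookup_annotator, rules))  # ['DT', 'NN', 'VBD']
```

Because each rule sees the output of the rules before it, the order of the list matters: the same rules applied in a different order can produce different tags.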
Example usage: POS Tagging
- Initial-state annotator: each word is assigned its most likely POS tag, as estimated from the training corpus.
- Non-lexical transformations: triggered mainly by the tags of words located near the target position.
- Lexical transformations: triggered mainly by the words in the surrounding context.
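The POS tagging setup above can be sketched as a most-likely-tag baseline plus two kinds of trigger predicates. The function names, the toy corpus, the default tag `NN` for unseen words, and the specific triggers ("previous tag is ..." and "previous word is ...") are assumptions chosen for illustration; the paper's templates cover a wider range of contexts.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Trigger = Callable[[List[str], List[str], int], bool]  # (words, tags, position) -> bool


def train_most_likely_tag(tagged: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each word to the tag it carries most often in the training corpus."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, tag in tagged:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def initial_tags(words: List[str], lexicon: Dict[str, str], default: str = "NN") -> List[str]:
    """Initial-state annotation; the default for unseen words is an assumption."""
    return [lexicon.get(w, default) for w in words]


def prev_tag_is(tag: str) -> Trigger:
    """Non-lexical trigger: looks only at the tag of the preceding word."""
    return lambda words, tags, i: i > 0 and tags[i - 1] == tag


def prev_word_is(word: str) -> Trigger:
    """Lexical trigger: looks at the preceding word itself."""
    return lambda words, tags, i: i > 0 and words[i - 1] == word


# Toy training corpus: "can" is tagged MD twice and NN once.
corpus = [("the", "DT"), ("can", "MD"), ("can", "MD"), ("can", "NN"), ("rusted", "VBD")]
lexicon = train_most_likely_tag(corpus)

words = ["the", "can", "rusted"]
tags = initial_tags(words, lexicon)
print(tags)  # ['DT', 'MD', 'VBD']
print(prev_tag_is("DT")(words, tags, 1), prev_word_is("the")(words, tags, 1))  # True True
```

Both triggers fire at the position of "can" in this example, so either could drive a rule that corrects the baseline's MD tag to NN.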
Related papers:
- Text Chunking using Transformation-Based Learning: transformation-based learning applied to noun-phrase chunking (Ramshaw & Marcus, 1995).
- Tagging gene and protein names in biomedical text: transformation-based learning applied to gene and protein tagging, in a system called Abgene (Tanabe & Wilbur, 2002).
- Sense Deduction: The Power of Peewees Applied to the SENSEVAL-2 Swedish Lexical Sample Task: transformation-based learning applied to word sense disambiguation (Lager & Zinovjeva, 2001).