Difference between revisions of "Part of Speech Tagging"
From Cohen Courses
Revision as of 19:27, 31 October 2010
Summary
Part of Speech Tagging (or POS Tagging for short) is a task in computational linguistics that marks each word in a text corpus with its part of speech (a word category such as noun, verb, or adjective), based on the word's definition and its context of usage.
POS tagging is often used as a preprocessing step for tasks such as Parsing, and it is also useful in tasks such as Word Sense Disambiguation and Speech Synthesis.
Common Approaches
Some common approaches to POS Tagging include the following:
- Hidden Markov Model (HMM) based approaches, sometimes referred to as stochastic algorithms in older literature
- Transformation-based learning, e.g. the Brill Tagger
- Dynamic Programming/Viterbi-like algorithms (DeRose & Church), mentioned for historical reasons
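To make the HMM and Viterbi-style approaches listed above more concrete, here is a minimal sketch in Python. It is not taken from any of the systems on this page: the three-tag tagset, the vocabulary, every probability value, and the names `viterbi_tag`, `log_emit`, and `UNSEEN` are assumptions chosen purely for illustration. A real tagger would estimate these tables from an annotated corpus and handle unknown words more carefully.

```python
import math

# Toy HMM. Every number below is a made-up illustrative value,
# not an estimate from any real corpus.
TAGS = ["DET", "NOUN", "VERB"]

# P(tag at sentence start)
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}

# P(next tag | previous tag)
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}

# P(word | tag); "dog" and "walk" are ambiguous between NOUN and VERB
EMIT = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.4, "walk": 0.2, "park": 0.4},
    "VERB": {"walk": 0.5, "dog": 0.1, "barks": 0.4},
}

# Hypothetical floor probability for (tag, word) pairs missing from EMIT
UNSEEN = 1e-6


def log_emit(tag, word):
    """Log emission probability, falling back to the UNSEEN floor."""
    return math.log(EMIT[tag].get(word, UNSEEN))


def viterbi_tag(words):
    """Return the most probable tag sequence for `words` under the toy HMM."""
    if not words:
        return []

    # scores[i][t]: best log-probability of any tag path ending in tag t at word i
    scores = [{t: math.log(START[t]) + log_emit(t, words[0]) for t in TAGS}]
    back = []  # back[i][t]: best previous tag when word i + 1 is tagged t

    for word in words[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for t in TAGS:
            # Dynamic programming step: keep only the best predecessor per tag
            best_prev = max(TAGS, key=lambda p: prev[p] + math.log(TRANS[p][t]))
            cur[t] = prev[best_prev] + math.log(TRANS[best_prev][t]) + log_emit(t, word)
            ptr[t] = best_prev
        scores.append(cur)
        back.append(ptr)

    # Follow the back-pointers from the best final tag
    path = [max(TAGS, key=lambda t: scores[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

With these toy tables, `viterbi_tag(["the", "walk"])` tags "walk" as a noun while `viterbi_tag(["dog", "walk"])` tags it as a verb, which illustrates how the surrounding context, and not only the word itself, determines the tag. Working in log space avoids numeric underflow on longer sentences.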
Example Systems
- FastTag - open-source implementation of the Brill Tagger
- Stanford Log-linear Part-of-Speech Tagger
- OpenNLP Tagger - based on a maximum entropy model
- CRF Tagger - based on conditional random fields
- LingPipe - toolkit that contains models for POS tagging