Difference between revisions of "Part of Speech Tagging"
From Cohen Courses
References / Links
- Webpage with links to many different POS tagger systems, from "Statistical natural language processing and corpus-based computational linguistics: An annotated list of resources" - [http://www-nlp.stanford.edu/links/statnlp.html#Taggers]
- Wikipedia article on Part of Speech Tagging - [http://en.wikipedia.org/wiki/Part-of-speech_tagging]
Revision as of 19:17, 31 October 2010
Summary
Part of Speech Tagging (or POS Tagging for short) is a task in computational linguistics: labeling each word in a text corpus with its word category, known as its part of speech (such as noun, verb, or adjective).
Common Approaches
Some common approaches to POS Tagging include the following:
- Hidden Markov Models based approaches, sometimes referred to as stochastic algorithms in older literature
- Dynamic Programming/Viterbi-like algorithms (DeRose & Church)
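- Other approaches: Brill Tagger (transformation-based learning, usually trained on a tagged corpus), Constraint Grammar (hand-written rules), and the Forward-Backward (Baum-Welch) algorithm, which can train an HMM tagger without labeled data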
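To make the HMM and Viterbi items above concrete, here is a minimal sketch of Viterbi decoding for a bigram HMM tagger. Everything in it is an illustrative assumption rather than part of any system listed on this page: the three-tag tag set, the hand-picked probabilities, the function name `viterbi`, and the `unk` floor used for words a tag never emits. A real tagger would estimate these tables from a tagged corpus.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for `words` under a bigram HMM.

    `unk` is an assumed floor probability for words a tag has never emitted.
    """
    if not words:
        return []

    def lp(p):
        # Work in log space to avoid underflow on long sentences.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: log-probability of the best tag path ending in tag t at word i.
    best = [{t: lp(start_p[t]) + lp(emit_p[t].get(words[0], unk)) for t in tags}]
    # back[i][t]: the previous tag on that best path.
    back = [{}]

    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], unk))
            back[i][t] = prev

    # Pick the best final tag, then follow the back-pointers to the start.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path


# Toy model: all numbers below are made up for illustration.
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.5, "walk": 0.2, "park": 0.3},
    "VERB": {"walk": 0.6, "barks": 0.4},
}

print(viterbi(["the", "dog", "barks"], TAGS, START_P, TRANS_P, EMIT_P))
# ['DET', 'NOUN', 'VERB']
```

In this toy model "walk" can be a noun or a verb; after "the" the decoder picks NOUN, because the assumed DET-to-NOUN transition probability outweighs the higher VERB emission probability.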
Example Systems
- FastTag - open-source implementation of the Brill Tagger
- Stanford Log-linear Part-of-Speech Tagger
- OpenNLP Tagger - based on maximum entropy
- CRF Tagger - based on conditional random fields
- LingPipe - toolkit that includes models for POS tagging