Part of Speech Tagging

From Cohen Courses

Revision as of 19:31, 31 October 2010

Summary

Part of Speech Tagging (or POS Tagging for short) is a task in computational linguistics that marks each word in a text corpus with its word category, known as its part of speech (such as noun, verb, or adjective), based on both the word's definition and its context of usage.

POS tagging is useful as a preprocessing step for tasks such as Parsing, and also for tasks such as Word Sense Disambiguation and Speech Synthesis.

Common Approaches

Some common approaches to POS Tagging include the following:

  • Hidden Markov Model (HMM) based approaches, sometimes referred to as stochastic algorithms in older literature
  • Transformation-based learning - Brill Tagger
  • Dynamic Programming/Viterbi-like algorithms - DeRose & Church, mentioned for historical reasons
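
As an illustration of how the HMM and Viterbi entries above fit together, here is a minimal sketch of Viterbi decoding for a bigram HMM tagger. The tag set, the probability tables (`START_P`, `TRANS_P`, `EMIT_P`), and the `floor` smoothing value are all invented for this example; they are not taken from any of the systems named above.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability; unseen events get a small floor instead of log(0).
        return math.log(p if p > 0 else floor)

    # best[i][t]: log-probability of the best tag path ending in tag t at position i
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy model (hypothetical numbers, for illustration only).
TAGS = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.5, "dog": 0.05},
}

print(viterbi(["the", "dog", "barks"], TAGS, START_P, TRANS_P, EMIT_P))
```

Working in log space avoids numerical underflow on long sentences. A real tagger would estimate the tables from a tagged corpus and use a proper smoothing scheme rather than a fixed floor.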

Sources of information/evidence often used by POS taggers:

  • distribution of tags for the word in isolation: P(t|w)
  • "Syntagmatic information": some POS sequences are much more common than others because of the syntactic constraints of the language
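
Both sources of evidence listed above can be estimated by simple counting over a tagged corpus. The following is a minimal sketch; the four-sentence corpus, the tag names, the `<s>` start marker, and the helper names `estimate` and `prob` are all made up for illustration.

```python
from collections import Counter, defaultdict

def estimate(tagged_sentences):
    """Count tag-given-word and tag-given-previous-tag events."""
    word_tag = defaultdict(Counter)    # word_tag[w][t]: times word w carried tag t
    tag_bigram = defaultdict(Counter)  # tag_bigram[prev][t]: times tag t followed prev
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker (an assumption of this sketch)
        for word, tag in sentence:
            word_tag[word.lower()][tag] += 1
            tag_bigram[prev][tag] += 1
            prev = tag
    return word_tag, tag_bigram

def prob(counter, key):
    """Relative frequency of `key` in `counter`; 0.0 if the counter is empty."""
    total = sum(counter.values())
    return counter[key] / total if total else 0.0

# Hypothetical toy corpus.
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("can", "NOUN"), ("fell", "VERB")],
    [("dogs", "NOUN"), ("can", "AUX"), ("bark", "VERB")],
    [("she", "PRON"), ("can", "AUX"), ("run", "VERB")],
]
word_tag, tag_bigram = estimate(CORPUS)

# Lexical evidence: P(t|w) for the ambiguous word "can".
print(prob(word_tag["can"], "AUX"), prob(word_tag["can"], "NOUN"))
# Syntagmatic evidence: how often a NOUN follows a DET.
print(prob(tag_bigram["DET"], "NOUN"))
```

In this toy corpus the lexical distribution alone prefers AUX for "can", while the tag-sequence counts favor NOUN after a determiner, which is the kind of conflict that taggers combining both sources are meant to resolve.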

Example Systems

References / Links

  • Webpage with links to many different POS tagger systems, from Statistical natural language processing and corpus-based computational linguistics: An annotated list of resources - [1]
  • Wikipedia article on Part of Speech Tagging - [2]