Difference between revisions of "Klein et al, CONLL 2003"
From Cohen Courses
Revision as of 21:54, 30 November 2010
Citation
Dan Klein, Joseph Smarr, Huy Nguyen, and Christopher D. Manning. 2003. Named Entity Recognition with Character-Level Models. In Proceedings of CoNLL-2003.
Online version
Summary
In this paper, the authors propose using character representations instead of word representations in the Named Entity Recognition task. In word model,
Conditional Random Fields approach to the Arabic Named Entity Recognition problem. Arabic is a highly inflectional language in which words can take both prefixes and suffixes. Besides this complex morphology, Arabic has no capital letters, which makes the NER task even harder.
An earlier paper that takes a character-level approach is Cucerzan and Yarowsky, SIGDAT 1999. That paper used prefix and suffix tries, whereas in this paper all the characters of a word are used.
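The contrast between affix-only and all-character representations can be illustrated with a small sketch. This is not the authors' implementation: the function names (affixes, char_ngrams), the n-gram length, and the "#" boundary marker are assumptions made for illustration only.

```python
def affixes(word, max_len=3):
    """Prefixes and suffixes of up to max_len characters (affix-only view)."""
    n = min(max_len, len(word))
    prefixes = [word[:k] for k in range(1, n + 1)]
    suffixes = [word[-k:] for k in range(1, n + 1)]
    return prefixes, suffixes


def char_ngrams(word, n=3, pad="#"):
    """Every character n-gram of the word, including word-internal ones.

    The pad symbol marks the word boundaries (a hypothetical choice).
    """
    padded = pad + word + pad
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


if __name__ == "__main__":
    print(affixes("Boston"))      # only the edges of the word
    print(char_ngrams("Boston"))  # edges plus the interior of the word
```

The affix view sees only the edges of a word, while the n-gram view also covers word-internal substrings such as "ost" and "sto" in "Boston".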