Klein et al, CONLL 2003

Citation

Dan Klein, Joseph Smarr, Huy Nguyen and Christopher D. Manning. 2003. Named Entity Recognition with Character-Level Models. In Proceedings of CoNLL-2003.

Online version

ACL Anthology

Summary

In this paper, the authors propose using character-level representations instead of word-level representations for the Named Entity Recognition task. The first model is a character-level HMM with minimal context information; the second is a maximum-entropy conditional Markov model (CMM) with rich context features.

In the character-level HMM, each character corresponds to one state, which depends only on the previous state, and each character observation depends on the current state and on the previous n-1 observations. To prevent the characters of a single word from receiving different state labels, each state is represented as a pair (t, k), where t is the entity type and k is the length of time spent in that state. They limit k to the n-gram history and mark the end of an entity with a final state F. They experiment with two models: one uses the preceding context, the other uses no context at all. Using context information gave slightly better performance than the model without context, but both models performed much better than a word-level HMM.
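
Below is a minimal sketch of the (t, k) state encoding and a character-level Viterbi decode. The entity types, the cap on k, and the transition/emission scorers are all placeholder assumptions, and the special final state F is omitted for brevity; this illustrates the encoding described above, not the authors' implementation.

 import math
 from itertools import product
 
 # Placeholder entity types and a cap on k, standing in for the
 # paper's label set and its n-gram history limit.
 ENTITY_TYPES = ["PER", "LOC", "O"]
 MAX_K = 3
 
 # Each state is a (t, k) pair: an entity type plus how long the
 # model has stayed in that type so far.
 STATES = list(product(ENTITY_TYPES, range(1, MAX_K + 1)))
 
 def successor_states(state):
     # Either continue in the same entity type (k grows, capped at
     # MAX_K) or start a new entity of a different type with k = 1.
     t, k = state
     return [(t, min(k + 1, MAX_K))] + [(t2, 1) for t2 in ENTITY_TYPES if t2 != t]
 
 def viterbi(chars, emit_logp, trans_logp):
     # emit_logp(state, char, history) scores a character given the
     # state and the previous n-1 characters; trans_logp(s, s2)
     # scores a state transition. Both would be estimated from data.
     best = {s: emit_logp(s, chars[0], "") for s in STATES if s[1] == 1}
     back_ptrs = []
     for i in range(1, len(chars)):
         history = chars[max(0, i - (MAX_K - 1)):i]
         nxt, ptr = {}, {}
         for s, score in best.items():
             for s2 in successor_states(s):
                 cand = score + trans_logp(s, s2) + emit_logp(s2, chars[i], history)
                 if cand > nxt.get(s2, -math.inf):
                     nxt[s2], ptr[s2] = cand, s
         best = nxt
         back_ptrs.append(ptr)
     # Recover the best state sequence by following back-pointers.
     state = max(best, key=best.get)
     path = [state]
     for ptr in reversed(back_ptrs):
         state = ptr[state]
         path.append(state)
     return list(reversed(path))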

The authors also built a character-based conditional Markov model, in which previous classification decisions are used as features. They also used joint tag-sequence features, longer-distance sequence and tag-sequence features, letter type pattern features, and additional context features. Overall, they showed that rich feature sets drawn from the context help performance.
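
A rough sketch of character-level features of this flavor for a conditional Markov model follows; the feature names, window sizes, and letter-type scheme are illustrative assumptions, not the paper's exact feature set.

 def letter_type(c):
     # Map a character to a coarse letter-type pattern,
     # e.g. "Mr." -> "Xx." (an illustrative scheme).
     if c.isupper():
         return "X"
     if c.islower():
         return "x"
     if c.isdigit():
         return "9"
     return c
 
 def cmm_features(chars, i, prev_tags):
     # Features for classifying the character at position i, given
     # the previous classification decisions (prev_tags).
     return {
         "char": chars[i],
         "prev_char": chars[i - 1] if i > 0 else "<s>",
         "next_char": chars[i + 1] if i + 1 < len(chars) else "</s>",
         # previous decisions used as features
         "prev_tag": prev_tags[-1] if prev_tags else "<s>",
         # a joint tag-sequence feature over a longer distance
         "prev_tag_bigram": "+".join(prev_tags[-2:]) if len(prev_tags) >= 2 else "<s>",
         # letter-type pattern of a small window ending at i
         "type_pattern": "".join(letter_type(c) for c in chars[max(0, i - 2):i + 1]),
     }

Since characters are classified left to right, the earlier decisions (prev_tags) are available as features at each step, which is what distinguishes this conditional Markov model from the HMM above.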

The paper does not give any information about the data set used.

A previous paper that uses a character-level approach is Cucerzan and Yarowsky, SIGDAT 1999. In that paper, the authors used prefix and suffix tries, whereas in this paper all the characters of a word are used.
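
For contrast, here is a toy prefix trie of the kind built over word prefixes (a suffix trie is the same structure built over reversed words); this is a hypothetical sketch of the data structure, not Cucerzan and Yarowsky's implementation.

 def build_prefix_trie(words):
     # Each node maps a character to an entry holding a count of how
     # many words share that prefix, plus the node's children.
     root = {}
     for word in words:
         node = root
         for c in word:
             entry = node.setdefault(c, {"count": 0, "children": {}})
             entry["count"] += 1
             node = entry["children"]
     return root
 
 # A suffix trie over the same vocabulary:
 # build_prefix_trie(w[::-1] for w in words)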