Klein et al, CONLL 2003
Citation
Dan Klein, Joseph Smarr, Huy Nguyen and Christopher D. Manning. 2003. Named Entity Recognition with Character-Level Models. In Proceedings of CoNLL-2003.
Online version
Summary
In this paper, the authors propose using character-level representations instead of word-level representations for the Named Entity Recognition task. The first model proposed is a character-level HMM with minimal context information, and the second is a maximum-entropy conditional Markov model with rich context features.
In the character-level HMM, each character has its own state, which depends only on the previous state. Each character observation depends on the current state and on the previous n-1 observations.
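The factorization described above can be made concrete with a small sketch. This is only an illustration: the function name `joint_log_prob`, the dictionary-based probability tables `trans` and `emit`, and the start symbol `"<s>"` are assumptions made for the example and do not come from the paper, which also applies smoothing that is omitted here.

```python
import math

def joint_log_prob(chars, states, trans, emit, n=3, start="<s>"):
    """Log-probability of a character sequence and its state sequence.

    Assumed (hypothetical) table layout:
      trans[(prev_state, state)]    = P(state | prev_state)
      emit[(state, history, char)]  = P(char | state, previous n-1 chars)
    """
    assert len(chars) == len(states)
    logp = 0.0
    prev_state = start
    for i, (c, s) in enumerate(zip(chars, states)):
        # The previous n-1 observed characters (shorter at the sequence start).
        history = chars[max(0, i - (n - 1)):i]
        logp += math.log(trans[(prev_state, s)])   # state depends on previous state
        logp += math.log(emit[(s, history, c)])    # char depends on state and history
        prev_state = s
    return logp
```

With toy tables such as `trans = {("<s>", "PER"): 0.5, ("PER", "PER"): 0.9}` and `emit = {("PER", "", "A"): 0.1, ("PER", "A", "l"): 0.2}`, calling `joint_log_prob("Al", ["PER", "PER"], trans, emit, n=2)` multiplies the four factors together in log space.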
To prevent the characters of a single word from receiving different state labels, each state is represented as a pair (t, k), where t is the entity type and k is the length of time the model has been in that state.
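One way such (t, k) states could keep a word's label consistent is sketched below. The transition rule here is an assumption made for illustration, not the paper's exact construction: the entity type may change only when the next character is a space, and k counts characters since the last boundary, capped at a hypothetical `max_k`. The function name `allowed_next` is likewise invented for the example.

```python
def allowed_next(state, next_char, entity_types, max_k=6):
    """Return the states allowed after `state` when `next_char` is read.

    Assumption for this sketch: a space is the only word boundary.
    """
    t, k = state
    if next_char == " ":
        # At a word boundary any entity type may begin; the counter resets.
        return [(t2, 0) for t2 in entity_types]
    # Inside a word the entity type is frozen; only the counter advances.
    return [(t, min(k + 1, max_k))]
```

For example, from `("PER", 2)` a non-space character leads only to `("PER", 3)`, so a decoder using this transition set cannot switch entity type in the middle of a word.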
An earlier paper that takes a character-level approach is Cucerzan and Yarowsky, SIGDAT 1999. That paper used prefix and suffix tries, whereas this paper uses all of the characters.