Difference between revisions of "Finkel and Manning, EMNLP 2009. Nested Named Entity Recognition"

From Cohen Courses

Revision as of 03:23, 27 September 2011

Nested Named Entity Recognition, by J. R. Finkel and C. D. Manning. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009.

This paper is available online [1].

Summary

This paper addresses a variant of the named entity recognition (NER) problem: identifying nested named entities. The authors present a method that does so using a discriminative constituency parser.

Nested ne.png

An example of a nested named entity in the first three tokens of the example sentence, which standard "flat" NER systems are unable to distinguish.

Brief description of the method

The authors model each sentence as a constituency tree. Each named entity corresponds to a phrase in the tree (i.e., a subtree), and a root node spans the entire sentence. The POS tags of words that are not part of any entity are also modeled. The diagram above is one example of such a "named entity tree".
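As a rough illustration of this representation, the sketch below builds a named entity tree from a list of nested entity spans. The toy sentence, the entity labels, the `Node` class, and the function names are all hypothetical and are not taken from the paper; the sketch also assumes that entity spans are properly nested (no crossing spans).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label); hypothetical format


@dataclass
class Node:
    label: str
    start: int
    end: int
    word: Optional[str] = None          # set only on POS-tag leaves
    children: List["Node"] = field(default_factory=list)


def build_entity_tree(tokens, pos_tags, entities: List[Span]) -> Node:
    """Build a tree with a ROOT node, one subtree per entity, and POS leaves."""

    def build(label, start, end, spans):
        node = Node(label, start, end)
        # Keep spans inside [start, end); outer spans sort before inner ones.
        inner = sorted(
            (s for s in spans if start <= s[0] and s[1] <= end),
            key=lambda s: (s[0], -s[1]),
        )
        i = start
        for s in inner:
            if s[0] < i:
                continue  # nested inside an earlier child; handled recursively
            for t in range(i, s[0]):
                node.children.append(Node(pos_tags[t], t, t + 1, word=tokens[t]))
            rest = [x for x in inner if x is not s]
            node.children.append(build(s[2], s[0], s[1], rest))
            i = s[1]
        for t in range(i, end):
            node.children.append(Node(pos_tags[t], t, t + 1, word=tokens[t]))
        return node

    return build("ROOT", 0, len(tokens), list(entities))


def to_bracket(node: Node) -> str:
    """Render the tree in bracketed form."""
    if node.word is not None:
        return f"({node.label} {node.word})"
    return f"({node.label} " + " ".join(to_bracket(c) for c in node.children) + ")"


tokens = ["Bank", "of", "China", "opened"]
pos_tags = ["NNP", "IN", "NNP", "VBD"]
entities = [(0, 3, "ORG"), (2, 3, "GPE")]  # GPE is nested inside ORG
print(to_bracket(build_entity_tree(tokens, pos_tags, entities)))
# (ROOT (ORG (NNP Bank) (IN of) (GPE (NNP China))) (VBD opened))
```

Words outside every entity hang directly off the root as POS-tagged leaves, which mirrors the description of non-entity POS tags being part of the tree.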

Annotated.png

The trees are first annotated with parent and grandparent labels and binarized in a right-branching manner. The authors then train a discriminative constituency parser based on Finkel et al. (ACL 2008).
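The two preprocessing steps can be sketched as follows. This is only a minimal illustration: the `^` separator, the `@` prefix for intermediate nodes, the `-` placeholder for a missing ancestor, and the tuple-based tree format are assumptions made here, and the paper's exact annotation scheme may differ.

```python
def annotate(tree, parent="-", grandparent="-"):
    """Append parent and grandparent labels to every node label.

    A tree is (label, word) for a leaf or (label, [children]) otherwise.
    """
    label, rest = tree
    new_label = f"{label}^{parent}^{grandparent}"
    if isinstance(rest, str):
        return (new_label, rest)
    return (new_label, [annotate(child, label, parent) for child in rest])


def binarize_right(tree):
    """Right-branching binarization: (A b c d) becomes (A b (@A c d))."""
    label, rest = tree
    if isinstance(rest, str):
        return tree
    kids = [binarize_right(child) for child in rest]
    while len(kids) > 2:
        # Fold the two rightmost children under an intermediate node.
        kids = kids[:-2] + [("@" + label, kids[-2:])]
    return (label, kids)


def leaves(tree):
    label, rest = tree
    if isinstance(rest, str):
        return [rest]
    return [w for child in rest for w in leaves(child)]


tree = ("ROOT", [
    ("ORG", [("NNP", "Bank"), ("IN", "of"), ("GPE", [("NNP", "China")])]),
    ("VBD", "opened"),
])
processed = binarize_right(annotate(tree))
print(processed)
```

After annotation, the leaf for "China" carries the label `NNP^GPE^ORG`, so a POS tag records which entities enclose it.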

The POS tags are modeled jointly with the named entities. The possible POS tags for each word are determined by distributional similarity clusters: a word may take any POS tag observed for another word in the same cluster. Because the POS tags are annotated with parent and grandparent labels, they also limit the kinds of entities a word can be part of. For instance, verbs would not be labeled as part of any entity.
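The cluster-based restriction on POS tags can be sketched as below. The cluster ids, the example words, and both function names are hypothetical; the sketch only assumes that each word maps to one cluster and that tagged training data is available.

```python
from collections import defaultdict


def cluster_tag_sets(tagged_words, word_to_cluster):
    """Collect, for each cluster, every POS tag seen on any of its words."""
    tags = defaultdict(set)
    for word, tag in tagged_words:
        cluster = word_to_cluster.get(word)
        if cluster is not None:
            tags[cluster].add(tag)
    return dict(tags)


def allowed_tags(word, word_to_cluster, tags_by_cluster):
    """A word may take any tag observed for a word in its cluster."""
    return tags_by_cluster.get(word_to_cluster.get(word), set())


# Hypothetical training observations and cluster assignments.
tagged = [("binds", "VBZ"), ("bound", "VBD"), ("bound", "VBN")]
clusters = {"binds": 7, "bound": 7, "activates": 7, "protein": 3}

by_cluster = cluster_tag_sets(tagged, clusters)
print(allowed_tags("activates", clusters, by_cluster))
```

Here "activates" never appears in the tagged data, yet it inherits the tags of "binds" and "bound" because it shares their cluster.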

Discriminative parser

The parser is the discriminatively trained, conditional random field-based CFG parser of Finkel et al. (2008). It is similar to a standard chart-based PCFG parser, except that clique potentials, rather than probabilities, are computed over spans.
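To illustrate the difference from a PCFG, the sketch below runs an inside (CKY-style) pass over a tiny binarized grammar whose rules carry arbitrary real-valued log-potentials instead of normalized probabilities. The resulting log partition function is what turns a tree's total score into a conditional probability. This is a toy under stated assumptions (binary and lexical rules only, hand-picked scores, hypothetical rule format), not the parser of Finkel et al. (2008).

```python
import math
from collections import defaultdict


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_partition(words, lexical, binary, root="ROOT"):
    """Inside algorithm in log space.

    lexical: {(label, word): log_potential}
    binary:  {(parent, left, right): log_potential}
    Returns log Z, the log of the summed exponentiated scores of all trees.
    """
    n = len(words)
    chart = defaultdict(dict)  # (i, j) -> {label: log inside score}
    for i, w in enumerate(words):
        for (label, word), score in lexical.items():
            if word == w:
                chart[i, i + 1][label] = score
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            terms = defaultdict(list)
            for k in range(i + 1, j):
                for (a, b, c), score in binary.items():
                    if b in chart[i, k] and c in chart[k, j]:
                        terms[a].append(score + chart[i, k][b] + chart[k, j][c])
            for a, xs in terms.items():
                chart[i, j][a] = logsumexp(xs)
    return chart[0, n].get(root, float("-inf"))


# Scores are unnormalized: they need not sum to one for any parent label.
lexical = {("X", "a"): 0.0, ("Y", "b"): 0.0, ("X", "b"): 0.0}
binary = {("ROOT", "X", "Y"): 1.0, ("ROOT", "X", "X"): 2.0}

log_z = log_partition(["a", "b"], lexical, binary)
tree_score = 2.0  # total log-potential of the tree ROOT -> X(a) X(b)
print(math.exp(tree_score - log_z))  # P(tree | sentence)
```

In a real CRF parser the log-potentials would be dot products of feature vectors and learned weights; the fixed numbers above stand in for those.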

Features

Due to the nested nature of the model, the authors were able to use nested features in addition to those found in standard CRF-based NER systems. Each word is labeled with its cluster from the distributional similarity clustering. There are local named entity features for each entity a word is possibly part of. Similarly, pairs of adjacent tokens are tagged with pairwise named entity features if they are siblings in a subtree. Features are also used for cases where entities are embedded in one another.
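The three feature families mentioned above might be generated along the following lines. The feature string templates, function names, and example values are invented for illustration; the summary does not list the paper's actual templates.

```python
def local_features(entity_label, words, clusters):
    """Features tying each word (and its cluster) to an entity it may belong to."""
    feats = []
    for word, cluster in zip(words, clusters):
        feats.append(f"word={word}|ent={entity_label}")
        feats.append(f"cluster={cluster}|ent={entity_label}")
    return feats


def pairwise_features(entity_label, left_word, right_word):
    """Features over two adjacent tokens that are siblings in a subtree."""
    return [f"pair={left_word}_{right_word}|ent={entity_label}"]


def embedded_features(outer_label, inner_label):
    """Features for one entity nested inside another."""
    return [f"outer={outer_label}|inner={inner_label}"]


# Hypothetical entity "Bank of China" (ORG) containing "China" (GPE).
feats = (
    local_features("ORG", ["Bank", "of", "China"], [12, 4, 31])
    + pairwise_features("ORG", "Bank", "of")
    + embedded_features("ORG", "GPE")
)
print(feats)
```

The embedded-entity feature is the kind that a flat sequence model has no place for, since it refers to two overlapping entities at once.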

Experimental Results

The authors performed experiments on the GENIA corpus, the JNLPBA corpus, and AnCora.

GENIA results.png

Their system achieves significant performance gains over a comparable flat semi-CRF NER system.

JNLPBA results.png

Ancora results.png

The authors' system generally performs better than flat models when evaluated on all entities, as compared to evaluation on top-level entities only. This demonstrates the relevance of modeling the hierarchy of named entities in an NER system.
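The two evaluation settings can be made concrete with a small scoring sketch. It assumes exact-match scoring on (start, end, label) spans and treats an entity as top-level when no strictly larger entity contains it; both are assumptions made here, and the numbers come from a made-up example rather than from the paper's results.

```python
def top_level(spans):
    """Keep spans that are not contained in a strictly larger span."""
    return {
        s for s in spans
        if not any(
            (o[0], o[1]) != (s[0], s[1]) and o[0] <= s[0] and s[1] <= o[1]
            for o in spans
        )
    }


def prf(gold, pred):
    """Exact-match precision, recall, and F1 over sets of spans."""
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# A flat system finds the outer entity but cannot output the nested one.
gold = {(0, 3, "ORG"), (2, 3, "GPE")}
flat_pred = {(0, 3, "ORG")}

print(prf(gold, flat_pred))                        # all entities: recall 0.5
print(prf(top_level(gold), top_level(flat_pred)))  # top-level only: (1.0, 1.0, 1.0)
```

In this toy case the flat prediction is perfect on top-level entities but loses recall once nested entities count, which is the kind of gap the all-entities evaluation exposes.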

Related Papers