Mnduong writeup of Borthwick et al.

From Cohen Courses
Revision as of 10:42, 3 September 2010 by WikiAdmin (talk | contribs) (1 revision)

This is a review of Borthwick_1998_exploiting_diverse_knowledge_sources_via_maximum_entropy_in_named_entity_recognition by user:mnduong.

  • This paper proposes a named entity recognition system using the maximum entropy framework.
  • The task is to identify names belonging to 7 categories, evaluated on the MUC-7 testbed.
  • The method uses features such as capitalization, vocabulary words that appeared at least 3 times, section identification, and dictionary membership. Furthermore, a feature is selected only if it fires at least 3 times in the training data.
  • The method yields state-of-the-art results among statistical systems when used by itself, and the highest results ever reported when combined with handcrafted systems.
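
The feature types and the count cutoff described above can be illustrated with a minimal sketch. This is not the paper's implementation: the feature names, the toy `FIRST_NAMES` dictionary, and the helper functions `token_features` and `select_features` are all hypothetical, and section features are left out for brevity. It only shows how binary features might be generated per token and then kept only if they fire at least 3 times in the training data.

```python
from collections import Counter

# Hypothetical toy dictionary of first names (an assumption for illustration).
FIRST_NAMES = {"john", "mary"}


def token_features(tokens, i):
    """Return the names of the binary features that fire for tokens[i]."""
    word = tokens[i]
    feats = []
    if word[:1].isupper():
        feats.append("cap=initial")      # capitalization feature
    if word.isupper():
        feats.append("cap=all")          # all-caps feature
    feats.append("word=" + word.lower())  # lexical (vocabulary) feature
    if word.lower() in FIRST_NAMES:
        feats.append("dict=first_name")  # dictionary feature
    return feats


def select_features(sentences, min_count=3):
    """Keep only the features that fire at least min_count times."""
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens)):
            counts.update(token_features(tokens, i))
    return {feat for feat, n in counts.items() if n >= min_count}


# Tiny made-up training set, used only to exercise the cutoff.
sentences = [
    ["John", "visited", "Paris"],
    ["Mary", "met", "John"],
    ["John", "left"],
]
selected = select_features(sentences)
```

On this toy data, rare lexical features such as `word=paris` fall below the cutoff, while `cap=initial`, `word=john`, and `dict=first_name` are retained. In a real maximum entropy system the retained features would then be paired with candidate tags and weighted during training, which this sketch does not attempt.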