Selen writeup of Borthwick et al.

From Cohen Courses

This is a review of Borthwick_1998_exploiting_diverse_knowledge_sources_via_maximum_entropy_in_named_entity_recognition by user:Selen.

In this paper, the authors build a named entity recognition system using n tags (which correspond to 29 tags) and four different knowledge sources: capitalization, lexical, and positional features, combined with a dictionary. Given when the paper was written, the feature sets remain primitive; however, the work may have been novel for its time in the sense that it combines several different kinds of features.
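To make the four knowledge sources concrete, here is a minimal sketch of how such binary features could be extracted for a single token. This is my own illustration and not the authors' implementation: the function name `token_features`, the feature strings, the toy dictionary, and the choice of "sentence start" as the positional feature are all assumptions.

```python
# Hypothetical sketch: the names, feature strings and toy dictionary below
# are illustrative assumptions, not taken from the paper.

def token_features(tokens, i, dictionary):
    """Return the set of active binary features for tokens[i]."""
    tok = tokens[i]
    feats = set()

    # Capitalization features
    if tok[:1].isupper():
        feats.add("cap=initial")
    if tok.isupper():
        feats.add("cap=all")

    # Lexical features: the token itself and its immediate neighbours
    feats.add("word=" + tok.lower())
    if i > 0:
        feats.add("prev=" + tokens[i - 1].lower())
    if i + 1 < len(tokens):
        feats.add("next=" + tokens[i + 1].lower())

    # Positional feature (assumed here to mean "first token of the sentence")
    if i == 0:
        feats.add("pos=sentence_start")

    # Dictionary features: one feature per word list that contains the token
    for label, entries in dictionary.items():
        if tok.lower() in entries:
            feats.add("dict=" + label)

    return feats


if __name__ == "__main__":
    sentence = ["Mr.", "Smith", "visited", "Paris"]
    toy_dictionary = {"location": {"paris"}, "first_name": {"john"}}
    print(sorted(token_features(sentence, 3, toy_dictionary)))
```

In a maximum entropy tagger, each such feature would be paired with a candidate tag and given a learned weight; the sketch only covers the feature extraction side.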

What I like about this paper is that they provide a comprehensive comparison between different named entity recognition systems, and on different data. They also report the results of combining different systems with their own.

What I don't like about this paper is that they could have used a more advanced feature selection method and more advanced features (trigrams, bigrams, etc.), but since it is an old paper, this might be too much to ask.