Liuliu writeup of Krishnan 2006
This is a review of Krishnan_2006_an_effective_two_stage_model_for_exploiting_non_local_dependencies_in_named_entity_recognition by user:Liuliu.
This paper proposes a two-stage method for incorporating non-local information into named entity recognition. The authors first use a conventional CRF to do named entity recognition based on local information only, and then use a second CRF whose features are based on the outputs of the first CRF. This method has several advantages, including:
- (1) Efficiency: The computational cost is much lower than that of training a global CRF model as in Sutton 2004.
- (2) Exact inference: Inference in both CRFs is exact rather than approximate, which makes the labeling more accurate.
- (3) Use of more global information: Sutton 2004 only uses "identical capitalized tokens". This work includes more global features (the different majority features).
I think these are the three biggest advantages of their work. Besides these, I also like that they ran a statistical test and showed the quantitative benefit of adding each of their majority features.
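To make the idea of a majority feature concrete, here is a minimal sketch of how I understand the token-level one: after the first CRF has labeled a document, each token gets as an extra feature the label that the first stage assigned most often to that same token string in the document. The function name `token_majority_features`, the exact-string matching, and the tie-breaking rule (the label seen first wins) are my own assumptions for illustration, not details taken from the paper.

```python
from collections import Counter, defaultdict


def token_majority_features(tokens, first_stage_labels):
    """Return, for each token, the majority first-stage label of that token string.

    Hypothetical helper: matching is by exact string, and ties are broken in
    favor of the label that was seen first (both are assumptions).
    """
    if len(tokens) != len(first_stage_labels):
        raise ValueError("tokens and labels must have the same length")

    # Count how often the first-stage CRF gave each label to each token string.
    label_counts = defaultdict(Counter)
    for token, label in zip(tokens, first_stage_labels):
        label_counts[token][label] += 1

    # Every occurrence of a token receives that token's majority label,
    # which the second-stage CRF could then use as an additional feature.
    return [label_counts[token].most_common(1)[0][0] for token in tokens]


# Example: one occurrence of "Australia" was labeled ORG by the first stage,
# but the majority label for that token in the document is LOC.
tokens = ["Australia", "beat", "Australia", "and", "Australia"]
stage_one = ["LOC", "O", "ORG", "O", "LOC"]
print(token_majority_features(tokens, stage_one))
# ['LOC', 'O', 'LOC', 'O', 'LOC']
```

The second CRF is free to disagree with this feature; it only sees the majority label as one more piece of evidence next to the local features.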
Overall, I like this method. Although at the beginning I felt that the method was a little tricky and lacked an elegant probabilistic model, it is simple and efficient. It's sort of "Divide and Conquer".
As I said in my first project proposal, I am very interested in this problem, and I have several questions about it. All the papers that I have read try to capture the global dependency that "similar" tokens tend to have the same label. However, what about different tokens? Are there any dependencies between different tokens that might also affect named entity recognition results and could benefit NER if we considered them in our model? I will continue thinking about this problem and try to figure out some of these different dependencies.