Philgoo project abstract
From Cohen Courses
About the project
In this study I compete with the published NER results of (Borthwick, 1998) and apply the analysis of (Klein, 2002) to the same task. I will implement the well-known HMM (joint likelihood) and CMM (conditional likelihood) models, with the CMM expected to perform better according to (Klein, 2002), and compare them against MENE and the other systems in (Borthwick, 1998); optimization issues are expected to arise in reaching a comparable score. Then, by separating model structure from estimation method, crossing (HMM, CMM) x (JL, CL, SCL), and evaluating each combination on MUC-7 data, I will analyze the relation and characteristics of each, as in (Klein, 2002). Comparing with (Klein, 2002) will also add intuition about the role of the data versus the (structure x estimation) choice.
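The structure side of this split can be sketched concretely. The toy sketch below (all probabilities are hypothetical, made up for illustration) contrasts the two structures: an HMM scores the joint probability P(tags, words) via transition and emission factors, while a CMM scores P(tags | words) directly with locally normalized conditionals over (previous tag, current word).

```python
# Toy parameters for a two-tag NER problem: O (outside) and PER (person).
# All numbers are illustrative, not estimated from any corpus.
trans = {("<s>", "O"): 0.8, ("<s>", "PER"): 0.2,
         ("O", "O"): 0.7, ("O", "PER"): 0.3,
         ("PER", "O"): 0.6, ("PER", "PER"): 0.4}
emit = {("O", "saw"): 0.5, ("O", "john"): 0.1,
        ("PER", "saw"): 0.05, ("PER", "john"): 0.6}

def hmm_joint(words, tags):
    """HMM: product of transition and emission factors (joint likelihood)."""
    p, prev = 1.0, "<s>"
    for w, t in zip(words, tags):
        p *= trans[(prev, t)] * emit[(t, w)]
        prev = t
    return p

# CMM: each tag is predicted by a locally normalized conditional
# P(t_i | t_{i-1}, w_i); again, hypothetical values.
cond = {("<s>", "john", "PER"): 0.9, ("<s>", "john", "O"): 0.1,
        ("PER", "saw", "O"): 0.95, ("PER", "saw", "PER"): 0.05}

def cmm_conditional(words, tags):
    """CMM: product of local conditionals P(t_i | t_{i-1}, w_i)."""
    p, prev = 1.0, "<s>"
    for w, t in zip(words, tags):
        p *= cond[(prev, w, t)]
        prev = t
    return p

words, tags = ["john", "saw"], ["PER", "O"]
print(hmm_joint(words, tags))        # joint P(words, tags) under the HMM
print(cmm_conditional(words, tags))  # conditional P(tags | words) under the CMM
```

The point of the (structure x estimation) cross is that either factorization can be trained with joint or conditional objectives; the sketch only fixes the structures being compared.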
What data I will use
- MUC-7 as in (Borthwick, 1998)
Why you think it’s interesting
- Even with the most basic classification models, matching published accuracy rates is hard
- Applying the analysis of (Klein, 2002) to the NER task
Relevant Background
- Experience implementing naive Bayes and logistic regression.
Evaluation
- Score by the MUC guidelines; compare with the published results of (Borthwick, 1998)
- Compare NER scores across conditional structure versus conditional estimation, as in (Klein, 2002)
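For the scoring step, entity-level precision, recall, and F1 can be sketched as below. This is a simplified stand-in, not the official MUC scorer (which also gives partial credit for type-only or span-only matches); here only exact (start, end, type) matches count.

```python
def prf1(gold, pred):
    """Entity-level P/R/F1 over sets of (start, end, type) spans.

    Simplified, exact-match scoring; the real MUC scorer also
    credits partial matches.
    """
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: one exact match, one span mismatch.
gold = {(0, 1, "PER"), (3, 5, "ORG")}
pred = {(0, 1, "PER"), (3, 4, "ORG")}
print(prf1(gold, pred))  # (0.5, 0.5, 0.5)
```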
Objective
- Thorough understanding of popular classification models
- Implementation issues and optimization problems in NER
References
- Conditional structure versus conditional estimation in NLP models, by D. Klein and C. D. Manning. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2002.
- Exploiting diverse knowledge sources via maximum entropy in named entity recognition, by A. Borthwick, J. Sterling, E. Agichtein, and R. Grishman. In Proceedings of the Sixth Workshop on Very Large Corpora, 1998.