Carreras et al, CoNLL 2003
Revision as of 15:34, 25 September 2011
Citation
Xavier Carreras, Lluís Màrquez, and Lluís Padró. 2003. A simple named entity extractor using AdaBoost. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4 (CONLL '03), Vol. 4. Association for Computational Linguistics, Stroudsburg, PA, USA, 152-155.
Online version
Summary
In this paper, the authors propose a simple AdaBoost-based approach to Named Entity Recognition. The approach solves the problem in two steps. The first is recognition, in which three binary classifiers are used to label each word as one of B, I or O. The second is NE classification, which is done using multiclass learning.
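To make the B, I, O labelling concrete, here is a minimal sketch of how entity spans map to per-word tags (B for the first word of an entity, I for words inside it, O for words outside any entity). The function name `bio_tags` and the `(start, end)` span format are assumptions made for illustration, not taken from the paper.

```python
def bio_tags(tokens, spans):
    """Return one B/I/O tag per token.

    spans: list of (start, end) token index pairs, end exclusive.
    This span format is a hypothetical choice for the example.
    """
    tags = ["O"] * len(tokens)          # default: outside any entity
    for start, end in spans:
        tags[start] = "B"               # first word of the entity
        for i in range(start + 1, end):
            tags[i] = "I"               # remaining words of the entity
    return tags


tokens = ["Xavier", "Carreras", "works", "in", "Barcelona"]
print(bio_tags(tokens, [(0, 2), (4, 5)]))
```

The recognition step only has to predict these three tags; deciding the entity type is left to the second step.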
Brief description of the method
The method represents each word by features drawn from its context. Each word in the neighbourhood of the current word is encoded as a feature together with its relative position. The kinds of features used were: Lexical, Syntactic, Orthographic, Affixes, Word Type Patterns, Left Predictions, Bag-of-Words, Trigger Words and Gazetteer features.
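The idea of encoding each neighbouring word together with its relative position can be sketched as follows. This covers only the lexical (word form) features; the function name `window_features`, the feature string format, and the window size of 3 are assumptions made for illustration, not values taken from the paper.

```python
def window_features(tokens, i, window=3):
    """Lexical context features for the word at position i.

    Each word within `window` positions of i becomes one feature
    that records both the word form and its relative offset.
    """
    feats = []
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):        # skip positions outside the sentence
            feats.append(f"word@{offset:+d}={tokens[j].lower()}")
    return feats


print(window_features(["in", "New", "York"], 1, window=1))
```

Because the offset is part of the feature name, the same word seen one position to the left and one position to the right yields two different features.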
The recognition task was performed by three independent binary one-vs-all classifiers, one for each of the B, I and O tags. For each word, the tag whose classifier had the maximum confidence was chosen.
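The maximum-confidence rule can be sketched as below. The confidence values are made up for the example, and the function name `pick_tag` is hypothetical; in practice each score would come from one of the three trained binary classifiers.

```python
def pick_tag(confidences):
    """Return the tag whose one-vs-all classifier is most confident.

    confidences: dict mapping each tag ("B", "I", "O") to the
    real-valued score of its binary classifier for the current word.
    """
    return max(confidences, key=confidences.get)


# Illustrative scores only, not results from the paper.
scores = {"B": 0.7, "I": -0.2, "O": 0.1}
print(pick_tag(scores))
```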
In the Named Entity Classification module, the multiclass multilabel AdaBoost.MH algorithm was used. The algorithm was run in different configurations, such as three-class classification and four-class classification. The latter performed better.
Experimental Result
In the recognition task, results were better for English than for German. On English, both precision and recall were approximately 95%.
In the classification task, they achieved 95.14% accuracy for English and 85.12% for German.