Kucuk and Yazici, FQAS 2009
Revision as of 13:22, 22 October 2010
Citation
Küçük, D. and Yazici, A. 2009. Named Entity Recognition Experiments on Turkish Texts. In Proceedings of the International Conference on Flexible Query Answering Systems. Roskilde, Denmark. T. Andreasen et al. (Eds.): FQAS 2009, LNAI 5822, pp. 524-535
Online version
Summary
This paper describes the first rule-based approach to the Named Entity Recognition task on Turkish texts. The authors used several external information sources in the system:
- Lexical Resources : Dictionaries and lists of well-known entities
- Pattern Bases : Context patterns to identify entities
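As a rough illustration of how these two kinds of resources can work together, here is a minimal sketch. It is not the authors' system: the lexicon entries, the single context pattern and the function name `recognize_entities` are all hypothetical, chosen only to show the idea of a dictionary lookup combined with a context pattern. The input is lowercased, so capitalization is not used.

```python
import re

# Hypothetical lexical resources (illustrative entries, not taken from the paper).
PERSON_NAMES = {"ahmet", "ayşe"}
LOCATION_NAMES = {"ankara", "izmir"}

# Hypothetical context pattern: any word followed by "üniversitesi" (university)
# is treated as an organization name.
ORGANIZATION_PATTERN = re.compile(r"\w+ üniversitesi")


def recognize_entities(text):
    """Return (entity text, entity type) pairs found by lookup and by pattern."""
    lowered = text.lower()  # capitalization is deliberately ignored
    entities = []

    # Step 1: dictionary lookup, token by token.
    for token in lowered.split():
        if token in PERSON_NAMES:
            entities.append((token, "PERSON"))
        elif token in LOCATION_NAMES:
            entities.append((token, "LOCATION"))

    # Step 2: context patterns over the whole text.
    for match in ORGANIZATION_PATTERN.finditer(lowered):
        entities.append((match.group(0), "ORGANIZATION"))

    return entities


print(recognize_entities("Ahmet Ankara yolunda Gazi Üniversitesi binasını gördü"))
```

A real system would need much larger dictionaries, many patterns and handling of Turkish suffixes, none of which this sketch attempts.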
The authors experimented on news texts from the METU Turkish Corpus and on some additional sources such as children's stories, historical texts and news video transcriptions. These texts were manually annotated by the authors.
For news articles they report an F-Measure of 78.7%. After analyzing the results, the authors reported the following cases as reasons for the low accuracy.
- Precision of person name recognition is low because some common nouns, which may also be used as person names, are extracted as entities. A similar precision problem also occurs with patterns. Both problems may stem from the system not using capitalization information.
- A compound organization entity is sometimes recognized as two separate entities.
The authors obtained even lower accuracy when the system was applied to other domains, because entities from those domains were absent from the lexical resources.
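For readers unfamiliar with the metric, the sketch below shows how an F-Measure is usually computed from precision and recall. It assumes the standard balanced F-Measure (the harmonic mean of precision and recall); this summary does not state which variant the paper uses. The function name `f_measure` and the counts in the example call are hypothetical and are not results from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Balanced F-Measure: harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives

    # Precision: share of extracted entities that are correct.
    precision = true_positives / predicted if predicted else 0.0
    # Recall: share of annotated entities that were extracted.
    recall = true_positives / actual if actual else 0.0

    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(f_measure(true_positives=8, false_positives=2, false_negatives=2))
```

Under this definition, common nouns wrongly extracted as person names count as false positives and lower precision, while entities missing from the lexical resources count as false negatives and lower recall.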
Related Papers
There are only two other related papers on NER for Turkish texts. One of them, Cucerzan and Yarowsky, SIGDAT 1999, used a language-independent bootstrapping algorithm, and the other, Tur et al, NLEJ 2003, used statistical methods.