Kucuk and Yazici, FQAS 2009

Citation

Küçük, D. and Yazici, A. 2009. Named Entity Recognition Experiments on Turkish Texts. In Proceedings of the International Conference on Flexible Query Answering Systems. Roskilde, Denmark. T. Andreasen et al. (Eds.): FQAS 2009, LNAI 5822, pp. 524-535

Online version

LNAI 5822

Summary

This paper describes the first rule-based approach to the Named Entity Recognition (NER) task on Turkish texts. The authors used several external information sources in the system:

  • Lexical resources: dictionaries and lists of well-known entities
  • Pattern bases: context patterns to identify entities
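
As a rough illustration of how these two kinds of resources can work together, here is a minimal sketch. It is not the authors' implementation: the gazetteer entries, the two context patterns, the entity type labels, and the function name `recognize` are all hypothetical and chosen only for the example.

```python
import re

# Hypothetical lexical resources (illustrative entries, not taken from the paper).
GAZETTEER = {
    "PERSON": {"Ahmet", "Ayşe"},
    "LOCATION": {"Ankara", "Roskilde"},
}

# Hypothetical pattern base: a word followed by a cue word marks an entity.
PATTERNS = [
    (re.compile(r"\b(\w+ Üniversitesi)\b"), "ORGANIZATION"),  # "X University"
    (re.compile(r"\b(\w+ Caddesi)\b"), "LOCATION"),           # "X Street"
]


def recognize(text):
    """Return sorted (start, end, surface, type) tuples from lookup and patterns."""
    found = set()
    # Lexical lookup: exact match of each token against the name lists.
    for match in re.finditer(r"\w+", text):
        for label, names in GAZETTEER.items():
            if match.group() in names:
                found.add((match.start(), match.end(), match.group(), label))
    # Pattern matching: the context (cue word) decides the entity type.
    for regex, label in PATTERNS:
        for match in regex.finditer(text):
            found.add((match.start(1), match.end(1), match.group(1), label))
    return sorted(found)


entities = recognize("Ahmet dün Ankara Üniversitesi binasına gitti.")
for start, end, surface, label in entities:
    print(start, end, surface, label)
```

This sketch does not resolve overlapping matches, so "Ankara" is reported both as a location on its own and as part of the organization "Ankara Üniversitesi". A real system would need a rule for choosing between such candidates.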

The authors experimented on news texts from the METU Turkish Corpus and on some additional sources such as children's stories, historical texts and news video transcriptions. These texts were manually annotated by the authors.

For news articles they report an F-measure of 78.7%. After analyzing the results, the authors attributed the low accuracy to the following cases:

  • Precision of person name recognition is low because some common nouns that can also be used as person names are extracted as entities. A similar precision problem occurs with the patterns. Both problems may stem from the system not using capitalization information.
  • A compound organization name is sometimes recognized as two separate entities.
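
For reference, the F-measure quoted above combines precision and recall into a single score. The sketch below assumes the balanced (F1) form, which this summary does not confirm, and the counts in the usage line are hypothetical, not the paper's figures.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Compute the F-measure from entity-level counts (beta=1 gives balanced F1)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 correct entities, 2 spurious, 2 missed.
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
```

False positives, such as common nouns tagged as person names, lower precision, while entities missing from the lexical resources lower recall; either one pulls the F-measure down.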

The authors obtained even lower accuracy when the system was applied to other domains, because many entities in those texts were absent from the lexical resources.

Related Papers

There are only two other related papers on NER for Turkish texts. One of them, Cucerzan and Yarowsky, SIGDAT 1999, used a language-independent bootstrapping algorithm, and the other, Tur et al, NLEJ 2003, used statistical methods.