Wu et al KDD 2008
Revision as of 23:27, 27 September 2011
Citation
Wu, F., Hoffmann, R. and Weld, D. 2008. Information Extraction from Wikipedia: Moving Down the Long Tail. In Proceedings of the 14th International Conference on Knowledge Discovery and Data Mining, pp. 731–739, ACM, New York.
Online version
Summary
This paper introduces three techniques for increasing the recall of information extraction from Wikipedia classes that have only a small number of articles (the long tail of sparse classes): shrinkage over a refined ontology, retraining using open information extractors, and supplementing results by extracting from the general Web.
Experimental results
...
Related papers
This paper improves a self-supervised information extractor first described in Wu and Weld CIKM 2007. The shrinkage technique uses a refined ontology, the output of the Kylin Ontology Generator, an autonomous system presented in Wu and Weld WWW 2008. The retraining technique uses TextRunner, an open information extractor described in Banko et al IJCAI 2007.