Selen writeup of TextRunner

This is a review of Banko_2007_open_information_extraction_from_the_web by user:Selen.

In this paper the authors present TextRunner, an open-domain information extraction system. Their goal is to extract relations given only the corpus. They do so in a single pass over the corpus; in other words, they do not perform bootstrapping. They compare their approach to KnowItAll and claim that their method is an improvement since it doesn't rely on a search engine (recall that this was the biggest issue with KnowItAll) and doesn't take any relation-specific input.

TextRunner has three modules:

Self-supervised learner
Single-Pass Extractor
Redundancy-based Assessor

To evaluate the accuracy of the extracted relations, they embed a Naive Bayes classifier. As a result, they report a 6 percent improvement over the KnowItAll system.
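To make the redundancy idea behind the third module concrete, here is a minimal sketch in Python. It is not the paper's method: it assumes extractions arrive as (sentence id, argument, relation, argument) tuples and simply counts how many distinct sentences support each normalized tuple, as a stand-in for the probability the assessor would assign. The names `normalize`, `assess` and `candidates` are hypothetical.

```python
from collections import defaultdict


def normalize(text):
    # Lowercase and collapse whitespace so trivially different
    # surface forms map to the same key (an assumed normalization).
    return " ".join(text.lower().split())


def assess(extractions):
    """Count the distinct sentences that support each normalized tuple.

    extractions: iterable of (sentence_id, arg1, relation, arg2).
    Returns a dict mapping (arg1, relation, arg2) to a support count.
    """
    support = defaultdict(set)
    for sentence_id, arg1, relation, arg2 in extractions:
        key = (normalize(arg1), normalize(relation), normalize(arg2))
        support[key].add(sentence_id)
    return {key: len(ids) for key, ids in support.items()}


# Toy input, invented for illustration only.
candidates = [
    (1, "Edison", "invented", "the light bulb"),
    (2, "edison", "Invented", "the  light bulb"),
    (2, "edison", "invented", "the light bulb"),  # same sentence, counted once
    (3, "Tesla", "worked for", "Edison"),
]
counts = assess(candidates)
```

A tuple seen in more distinct sentences gets a higher count, which is the intuition for why redundancy can be used to rank extractions.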

I like the idea of O-CRF.
