Nlao writeup of Banko 2007

From Cohen Courses

This is a review of Banko_2007_open_information_extraction_from_the_web by user:Nlao.

This work fixes the main problem of KnowItAll: its reliance on an existing web search engine. As a result, it achieves a significant improvement in speed.

However, I doubt the viability of single-pass extraction. Consider the approaches in KnowItAll (domain-independent patterns, domain-specific rules, webpage tables, and subclass patterns): if they are to bootstrap each other, there have to be multiple passes over the corpus. In this work, however, no bootstrapping is used.
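To make the concern above concrete, here is a toy sketch of why mutual bootstrapping needs more than one pass. It is not the algorithm of KnowItAll or of the reviewed paper; the corpus, the seed pattern, and the functions `single_pass` and `bootstrap` are all invented for illustration, and the pattern induction step is deliberately naive.

```python
import re

# Invented toy corpus (assumption, for illustration only).
CORPUS = [
    "cities such as Paris attract tourists",
    "cities such as Tokyo attract tourists",
    "Paris is the capital of France",
    "Tokyo is the capital of Japan",
    "Ottawa is the capital of Canada",
]

# Hypothetical domain-independent seed pattern.
SEED = r"such as ([A-Z]\w+)"


def single_pass(corpus, patterns):
    """Apply a fixed set of patterns once; nothing learned is fed back."""
    found = set()
    for sentence in corpus:
        for pattern in patterns:
            for match in re.finditer(pattern, sentence):
                found.add(match.group(1))
    return found


def bootstrap(corpus, seed_patterns, max_rounds=5):
    """Alternate instance extraction and (naive) pattern induction.

    Each round reads the corpus twice, so the pass count grows with
    the number of rounds needed to reach a fixed point.
    """
    patterns = set(seed_patterns)
    instances = set()
    passes = 0
    for _ in range(max_rounds):
        # Pass A: extract instances with the patterns known so far.
        new_instances = single_pass(corpus, patterns)
        passes += 1

        # Pass B: induce patterns from sentences that start with a
        # known instance, keeping the next four tokens as context.
        new_patterns = set()
        for sentence in corpus:
            for inst in instances | new_instances:
                prefix = inst + " "
                if sentence.startswith(prefix):
                    context = sentence[len(prefix):].split()[:4]
                    new_patterns.add(
                        r"([A-Z]\w+) " + re.escape(" ".join(context))
                    )
        passes += 1

        # Stop once a round adds neither instances nor patterns.
        if new_instances <= instances and new_patterns <= patterns:
            break
        instances |= new_instances
        patterns |= new_patterns
    return instances, patterns, passes


if __name__ == "__main__":
    print(single_pass(CORPUS, [SEED]))
    print(bootstrap(CORPUS, [SEED]))
```

In this toy setting, "Ottawa" never co-occurs with the seed pattern, so a single pass with the seed cannot find it; it becomes reachable only after a pattern induced from "Paris" and "Tokyo" is applied in a later pass.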

[minor points] The description of the self-supervised learner is not clear to me. Were the domain-independent patterns used to generate the initial seeds?