Nlao writeup of Banko 2007

From Cohen Courses
Revision as of 13:12, 2 November 2009 by Nlao (talk | contribs)

This is a review of Banko_2007_open_information_extraction_from_the_web by user:Nlao.

This work fixes the main problem of KnowItAll: its reliance on an existing web search engine. As a result, it achieves a significant improvement in speed.

However, I doubt the viability of single-pass extraction. Consider the approaches in KnowItAll (domain-independent patterns, domain-specific rules, webpage tables, and subclass patterns) bootstrapping each other: that requires multiple passes over the corpus. In this work, however, no bootstrapping is used.
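
To make the concern concrete, here is a toy sketch of pattern/instance bootstrapping. It is not the algorithm of KnowItAll or of the reviewed paper; the corpus, the seed pattern, and the function names (`extract`, `induce_patterns`, `bootstrap`) are all made up for illustration. It only shows why instances reachable through a learned pattern cannot be found in the same pass that learns the pattern.

```python
import re

# Hypothetical three-sentence corpus, for illustration only.
CORPUS = [
    "cities such as Paris attract tourists",
    "Paris is the capital of France",
    "Lima is the capital of Peru",
]

# Hypothetical domain-independent seed pattern.
SEED_PATTERNS = [r"such as (\w+)"]


def extract(corpus, patterns):
    """One full pass over the corpus: apply every known pattern."""
    found = set()
    for sentence in corpus:
        for pattern in patterns:
            found.update(m.group(1) for m in re.finditer(pattern, sentence))
    return found


def induce_patterns(corpus, instances):
    """Learn new patterns from sentences that start with a known instance.

    The three words following the instance become the pattern context.
    """
    patterns = set()
    for sentence in corpus:
        for inst in instances:
            if sentence.startswith(inst + " "):
                context = sentence[len(inst):].split()[:3]
                patterns.add(r"^(\w+) " + re.escape(" ".join(context)))
    return patterns


def bootstrap(corpus, seed_patterns, max_passes):
    """Alternate extraction and pattern induction for up to max_passes passes."""
    patterns = set(seed_patterns)
    instances = set()
    for _ in range(max_passes):
        new = extract(corpus, patterns) - instances
        if not new:
            break  # nothing new was found, so further passes cannot help
        instances |= new
        patterns |= induce_patterns(corpus, instances)
    return instances


single_pass = bootstrap(CORPUS, SEED_PATTERNS, max_passes=1)
multi_pass = bootstrap(CORPUS, SEED_PATTERNS, max_passes=3)
```

In this toy setting the single pass finds only "Paris", because the "is the capital" pattern is learned from "Paris" only after the pass has finished. A second pass is needed to pick up "Lima".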

[minor points] The description of the self-supervised learner is not clear to me. Were the domain-independent patterns used to generate the initial seeds?