Suranah writeup for Banko 2007

From Cohen Courses
Revision as of 11:42, 3 September 2010 by WikiAdmin (talk | contribs) (1 revision)

This is a review of Banko_2007_open_information_extraction_from_the_web by user:Suranah.

The paper discusses a scalable Information Extraction system. The system exploits the large amount of redundancy on the web to extract information without any patterns or seeds. I found one specific idea interesting: using more sophisticated techniques such as parsing to classify and learn instances, which yields a classifier that is easier to handle, learn, and scale.

Although the paper includes some evaluation, I find it sketchy. It would have been more interesting had they evaluated the system at the same scale as its comparison with KnowItAll: it is odd that they could evaluate over 22,000 facts for the comparison alone, but only 400 sentences for the evaluation of the system itself.