Suranah writeup for Banko 2007

From Cohen Courses

Latest revision as of 10:42, 3 September 2010

This is a review of Banko_2007_open_information_extraction_from_the_web by user:Suranah.

The paper describes a scalable information extraction system. The system exploits the large amount of redundancy on the web to extract information without any patterns or seeds. I found one specific idea interesting: the use of more sophisticated techniques such as parsing to both classify and learn instances, which yields a classifier that is easier to handle, learn, and scale.

Although the paper includes some evaluation, I find it sketchy. It would have been more interesting had they evaluated the system itself at the same scale as its comparison with KnowItAll. It is odd that they could evaluate over 22,000 facts for the comparison alone, but only 400 sentences for the evaluation of the system.