Suranah writeup for Banko 2007
From Cohen Courses
This is a review of Banko_2007_open_information_extraction_from_the_web by user:Suranah.
The paper describes a scalable information extraction system. The system exploits the large amount of redundancy on the web to extract information without any patterns or seeds. I found one specific idea interesting: the use of more sophisticated techniques such as parsing to both classify and learn instances, which yields a classifier that is easier to handle, train, and scale.
Although the paper includes some evaluation, I find it sketchy. It would have been more interesting had they evaluated the system at the same scale as its comparison with KnowItAll: it is odd that they could evaluate over 22,000 facts for the comparison alone, but only 400 sentences for the evaluation itself.