Rbalasub writeup of Etzioni et al.

From Cohen Courses

A review of Etzioni_2004_methods_for_domain_independent_information_extraction_from_the_web_an_experimental_comparison by user:rbalasub

This work addresses the task of collecting sets of entities from the web by augmenting the domain-independent KnowItAll system with three approaches:

  • Rule Learning - learns extraction rules from the contexts around seed entities (similar to SEAL)
  • Subclass Extraction
  • List Extraction - inducing wrappers for sites with lists of relevant entities
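
The rule-learning idea in the list above can be illustrated with a toy sketch. This is not the KnowItAll implementation: the function names (`learn_prefix_rules`, `apply_rules`), the two-word context window, the whitespace tokenization, and the example sentences are all assumptions made for illustration. It counts the words that precede known seed entities, keeps the most frequent context as a "rule", and then applies that rule to find new candidate entities.

```python
from collections import Counter


def learn_prefix_rules(sentences, seeds, width=2, top_k=1):
    """Count the `width` tokens preceding each seed mention; keep the top_k contexts."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            # Strip trailing punctuation so "Paris." still matches the seed "Paris".
            if tok.strip(".,") in seeds and i >= width:
                counts[tuple(tokens[i - width:i])] += 1
    return [ctx for ctx, _ in counts.most_common(top_k)]


def apply_rules(sentences, rules, seeds):
    """Return tokens that follow a learned context and are not already seeds."""
    found = set()
    for sentence in sentences:
        tokens = sentence.split()
        for rule in rules:
            w = len(rule)
            for i in range(w, len(tokens)):
                if tuple(tokens[i - w:i]) == rule:
                    candidate = tokens[i].strip(".,")
                    if candidate and candidate not in seeds:
                        found.add(candidate)
    return found


# Hypothetical mini-corpus and seed set, for illustration only.
sentences = [
    "We flew to cities like Paris last spring.",
    "Tourists love cities like London in summer.",
    "Many cities like Berlin have good transit.",
    "He visited Paris with friends.",
]
seeds = {"Paris", "London"}

rules = learn_prefix_rules(sentences, seeds)
new_entities = apply_rules(sentences, rules, seeds)
print(rules, new_entities)
```

In this sketch the context "cities like" precedes two seed mentions, so it becomes the single learned rule, and applying it surfaces "Berlin" as a new candidate. A real system would also need to score rules and candidates to limit the precision loss that comes with higher recall.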

Together, these enhancements increase the recall of the fact collection process.