Rbosaghz writeup of Wang ACL 2009


Wang_2009_automatic_set_instance_extraction_using_the_web by user:Rbosaghz.

This paper describes ASIE, another system for set expansion. The task differs slightly from regular set expansion: instead of seed instances, ASIE is given only the name of a semantic class as input (e.g., cars) and automatically outputs instances belonging to that class (e.g., Honda, BMW). ASIE builds upon SEAL and, like SEAL, works well in languages other than English. It uses the web as its source of data.

ASIE is composed of three main components: "Noisy Instance Generator, Reranker, and Bootstrapper. The Noisy Instance Generator extracts a set of candidate instances given a semantic class name, and ranks the instances by using a simple ranking model. The Reranker re-ranks the instances using evidence from semi-structured web documents such that noisy (irrelevant) ones are ranked lower in the list. The Bootstrapper enhances the quality and completeness of the ranked list by using an unsupervised iterative technique." Both the Reranker and the Bootstrapper rely on SEAL, the system we studied last class.
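
To make the three-stage flow concrete, here is a toy sketch in Python. It is not the paper's implementation: the "such as" text pattern, the list co-occurrence score standing in for SEAL, and all function names, parameters, and data are assumptions made up for illustration. It only shows how a noisy first ranking can be cleaned up by list evidence and then extended iteratively.

```python
import re
from collections import Counter


def noisy_instance_generator(class_name, snippets):
    """Stage 1 (toy): extract candidates from '<class> such as A, B and C'
    and rank them by raw frequency. The pattern is an assumption."""
    pattern = re.compile(re.escape(class_name) + r" such as ([^.]+)", re.IGNORECASE)
    counts = Counter()
    for snippet in snippets:
        for match in pattern.finditer(snippet):
            for name in re.split(r",|\band\b", match.group(1)):
                name = name.strip()
                if name:
                    counts[name] += 1
    return [name for name, _ in counts.most_common()]


def rerank(candidates, web_lists):
    """Stage 2 (toy stand-in for SEAL): a candidate scores one point for each
    semi-structured list it shares with at least one other candidate."""
    known = set(candidates)

    def score(name):
        return sum(
            1 for items in web_lists
            if name in items and len(known & set(items)) >= 2
        )

    # sorted() is stable, so ties keep their earlier relative order.
    return sorted(candidates, key=score, reverse=True)


def bootstrap(ranked, web_lists, seed_count=2, rounds=2):
    """Stage 3 (toy): treat the top-ranked items as seeds, pull in the other
    members of any list containing two or more seeds, then rerank and repeat."""
    result = list(ranked)
    for _ in range(rounds):
        seeds = set(result[:seed_count])
        for items in web_lists:
            if len(seeds & set(items)) >= 2:
                for item in items:
                    if item not in result:
                        result.append(item)
        result = rerank(result, web_lists)
    return result


# Made-up data: text snippets and "semi-structured" lists from web pages.
snippets = [
    "Dealers stock cars such as Honda, BMW and Toyota.",
    "Find cars such as Free Shipping.",
    "Cars such as Free Shipping and Honda.",
]
web_lists = [
    ["Honda", "BMW", "Toyota", "Audi"],
    ["BMW", "Honda", "Ford"],
    ["Free Shipping", "Contact Us"],
]

noisy = noisy_instance_generator("cars", snippets)
reranked = rerank(noisy, web_lists)
final = bootstrap(reranked, web_lists)
print(noisy)
print(reranked)
print(final)
```

In this toy run the noisy candidate "Free Shipping" starts near the top because it is frequent in the text, drops to the bottom once list evidence is used, and the bootstrapping step adds "Audi" and "Ford", which never appeared in the text snippets at all.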

The evaluation compares ASIE against a system by Kozareva et al.; ASIE outperforms the rival system on several classes.