Bbd writeup of hardening soft information sources

From Cohen Courses

Latest revision as of 11:42, 3 September 2010

This is a review of Cohen_2000_hardening_soft_information_sources by user:Bbd.

This paper addresses a problem with soft databases, which are built by heuristically extracting information from various sources and may therefore contain inconsistencies and duplicates. The authors present a formal model in which a soft database is a noisy version of an underlying hard database. They then infer the most likely hard database given a particular soft database.
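The inference step can be written compactly. The symbols here are my own shorthand, not necessarily the paper's notation: S is the observed soft database, H ranges over candidate hard databases, and the second equality is just Bayes' rule with the constant P(S) dropped.

```latex
\hat{H} \;=\; \arg\max_{H} P(H \mid S) \;=\; \arg\max_{H} P(S \mid H)\, P(H)
```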

They define a soft database as a set of instances of a fixed set of relations over a set of references inferred from the data. Hardening determines the co-reference relations between the references in the soft database, that is, which references denote the same entity.
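To make these definitions concrete, here is a minimal sketch of a soft database and of applying a given co-reference mapping to it. The tuple layout, the `harden` helper, and the toy data are all my own assumptions for illustration; none of them come from the paper.

```python
from typing import Dict, Set, Tuple

# Assumed layout: (relation_name, ref_1, ref_2, ...), with references as strings.
SoftTuple = Tuple[str, ...]


def harden(soft_db: Set[SoftTuple], coref: Dict[str, str]) -> Set[SoftTuple]:
    """Rewrite each reference to its canonical form; duplicate tuples collapse."""

    def canon(ref: str) -> str:
        # Follow the co-reference chain, guarding against accidental cycles.
        seen = set()
        while ref in coref and coref[ref] != ref and ref not in seen:
            seen.add(ref)
            ref = coref[ref]
        return ref

    return {(t[0],) + tuple(canon(a) for a in t[1:]) for t in soft_db}


# Made-up soft database: two extractions that describe the same fact differently.
soft_db = {
    ("author", "A. Smith", "soft databases"),
    ("author", "Alice Smith", "Soft Databases"),
    ("affiliation", "Alice Smith", "Example University"),
}

# Hypothetical co-reference decisions (the thing hardening has to find).
coref = {
    "A. Smith": "Alice Smith",
    "soft databases": "Soft Databases",
}

hard_db = harden(soft_db, coref)
```

With this mapping the two `author` tuples become identical, so the hardened set holds two tuples instead of three. Finding a good `coref` mapping in the first place is the hard part.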

I liked the efficient greedy algorithm they propose for the NP-hard problem of finding the optimal hard database.
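To give a feel for what a greedy approach to this kind of problem can look like, below is a naive sketch. It is not the paper's algorithm. It assumes a toy objective (a fixed cost per distinct hardened tuple plus a cost per accepted merge) and a hypothetical list of candidate merges with costs supplied by the caller. It recomputes the hardened database on every step, so it makes no claim to efficiency.

```python
def find(parent, x):
    """Union-find lookup with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def hardened(soft_db, parent):
    """Rewrite every reference to its current representative."""
    return {(t[0],) + tuple(find(parent, a) for a in t[1:]) for t in soft_db}


def greedy_harden(soft_db, candidates, tuple_cost=1.0):
    """candidates: list of (merge_cost, ref_a, ref_b); cheapest are tried first.

    A merge is accepted only if the tuples it eliminates are worth more than
    the merge costs under the assumed toy objective.
    """
    refs = {a for t in soft_db for a in t[1:]}
    refs |= {r for _, a, b in candidates for r in (a, b)}
    parent = {r: r for r in refs}
    for merge_cost, a, b in sorted(candidates):
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            continue  # already co-referent
        before = len(hardened(soft_db, parent))
        trial = dict(parent)
        trial[ra] = rb  # tentatively merge the two classes
        after = len(hardened(soft_db, trial))
        if tuple_cost * (before - after) > merge_cost:
            parent = trial  # keep the merge: it lowers the total cost
    return hardened(soft_db, parent), parent


# Made-up data: the first merge removes a duplicate tuple, the second does not.
soft_db = {
    ("author", "A. Smith", "Soft Databases"),
    ("author", "Alice Smith", "Soft Databases"),
    ("author", "Bob Jones", "Hard Databases"),
}
candidates = [
    (0.5, "A. Smith", "Alice Smith"),
    (0.5, "Alice Smith", "Bob Jones"),
]
hard_db, parent = greedy_harden(soft_db, candidates)
```

On this toy input the sketch merges "A. Smith" into "Alice Smith" and leaves "Bob Jones" alone, because only the first merge collapses two tuples into one.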