Bbd writeup of hardening soft information sources
From Cohen Courses
Latest revision as of 11:42, 3 September 2010
This is a review of Cohen_2000_hardening_soft_information_sources by user:Bbd.
This paper addresses problems with soft databases, which are built by heuristically extracting information from various sources and may therefore contain inconsistencies and duplicates. The authors present a formal model in which a soft database is a noisy version of an underlying hard database, and then infer the most likely hard database given a particular soft database.
They define a soft database as a set of instances of a fixed set of relations over a set of references inferred from the data. Hardening determines the co-reference relations among the references in the soft database.
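To make the definition concrete, here is a minimal sketch of what applying a set of co-reference decisions to a soft database might look like. The representation (tuples of a relation name plus reference strings, and a plain dictionary `coref` mapping a reference to the one it is taken to co-refer with) and the function name `harden` are my own illustrative assumptions, not the paper's notation.

```python
def harden(soft_db, coref):
    """Rewrite every reference through `coref` and drop duplicate tuples.

    soft_db: iterable of (relation_name, (ref1, ref2, ...)) tuples.
    coref:   dict mapping a reference to the reference it co-refers with
             (hypothetical representation, chosen for illustration).
    """
    def canon(ref):
        # Follow co-reference links until a reference with no outgoing link
        # (or a self-link) is reached; `seen` guards against cycles.
        seen = set()
        while ref in coref and coref[ref] != ref and ref not in seen:
            seen.add(ref)
            ref = coref[ref]
        return ref

    # A set comprehension merges tuples that become identical after rewriting.
    return {(rel, tuple(canon(a) for a in args)) for rel, args in soft_db}


# Toy soft database: two spellings of one author and one title.
soft_db = {
    ("author", ("W. Cohen", "Hardening soft information sources")),
    ("author", ("William Cohen", "Hardening Soft Information Sources")),
    ("author", ("H. Kautz", "Hardening soft information sources")),
}
coref = {
    "W. Cohen": "William Cohen",
    "Hardening soft information sources": "Hardening Soft Information Sources",
}
hard_db = harden(soft_db, coref)
```

In this toy example the two Cohen tuples collapse into one, so the three soft tuples become two hard tuples.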
I liked the efficient greedy implementation they proposed for the NP-hard problem of finding the optimal hard database.
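The flavor of a greedy approach can be sketched as follows. This is a simplified stand-in, not the algorithm from the paper: the cost model (a fixed `tuple_cost` per distinct hard tuple plus a per-pair merge cost), the candidate list, and all names below are assumptions made for illustration.

```python
def find(parent, x):
    # Follow parent links to the representative of x's co-reference cluster.
    while parent.get(x, x) != x:
        x = parent[x]
    return x


def hard_tuples(soft_db, parent):
    # Rewrite each reference to its representative; the set removes duplicates.
    return {(rel, tuple(find(parent, a) for a in args)) for rel, args in soft_db}


def greedy_harden(soft_db, candidates, tuple_cost=1.0):
    """Accept candidate merges one at a time, only if total cost decreases.

    candidates: list of (ref_a, ref_b, merge_cost) triples (hypothetical;
    the merge cost could come from, e.g., a string-distance heuristic).
    """
    parent = {}
    merge_total = 0.0
    cost = tuple_cost * len(hard_tuples(soft_db, parent))
    for a, b, merge_cost in candidates:
        ra, rb = find(parent, a), find(parent, b)
        if ra == rb:
            continue  # already in the same cluster
        trial = dict(parent)
        trial[ra] = rb
        trial_cost = (tuple_cost * len(hard_tuples(soft_db, trial))
                      + merge_total + merge_cost)
        if trial_cost < cost:
            parent, merge_total, cost = trial, merge_total + merge_cost, trial_cost
    return hard_tuples(soft_db, parent), cost


soft_db = [
    ("author", ("W. Cohen", "paper1")),
    ("author", ("William Cohen", "paper1")),
    ("author", ("H. Kautz", "paper1")),
]
candidates = [
    ("W. Cohen", "William Cohen", 0.4),  # similar strings: cheap merge
    ("W. Cohen", "H. Kautz", 1.5),       # dissimilar strings: expensive merge
]
hard_db, total_cost = greedy_harden(soft_db, candidates)
```

Here the cheap merge is accepted because it removes a duplicate tuple, while the expensive one is rejected because the saving of one tuple does not cover its merge cost. A one-step greedy rule like this can miss merges that pay off only in combination, which is the usual trade-off for avoiding an exhaustive search.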