Yandongl writeup of Cohen 2000 KDD


This is a review of Cohen_2000_hardening_soft_information_sources by user:Yandongl.

This paper addresses the problem of deriving a hard database from a soft database. A soft database contains references whose coreference is uncertain; by inferring which entities/references corefer, the authors map a soft database to a hard one. They propose a probabilistic model whose goal is to maximize the joint probability P(H, I, S), where H is the hard database, I the interpretation, and S the soft database. Finding an optimal hardening is NP-hard, so the authors use a greedy algorithm to solve it.
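
To make the idea of greedy hardening concrete, here is a minimal Python sketch. It is not the paper's algorithm or cost function: the cost used here (a fixed `tuple_cost` per distinct hard tuple plus a per-pair merge weight) is an assumption chosen for illustration, and the names `greedy_harden`, `hard_tuples`, `candidates`, and the toy data are all hypothetical. The sketch repeatedly accepts the candidate coreference that lowers the total cost the most and stops when no candidate helps.

```python
def find(parent, ref):
    """Follow parent links to the representative of ref's coreference class."""
    while parent[ref] != ref:
        ref = parent[ref]
    return ref


def hard_tuples(soft, parent):
    """Rewrite every soft tuple with class representatives; duplicates collapse."""
    return {tuple(find(parent, r) for r in row) for row in soft}


def greedy_harden(soft, candidates, tuple_cost=1.0):
    """Greedy hardening under an ASSUMED cost:
    tuple_cost * (number of hard tuples) + sum of weights of accepted merges.
    `candidates` maps a pair of references to the (hypothetical) cost of merging them.
    """
    parent = {r: r for row in soft for r in row}
    for a, b in candidates:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    remaining = dict(candidates)
    spent = 0.0  # total weight of merges accepted so far
    total = tuple_cost * len(hard_tuples(soft, parent))

    while remaining:
        best = None
        for (a, b), weight in remaining.items():
            trial = dict(parent)
            trial[find(trial, b)] = find(trial, a)  # tentatively merge b into a
            cost = tuple_cost * len(hard_tuples(soft, trial)) + spent + weight
            if cost < total and (best is None or cost < best[0]):
                best = (cost, (a, b), trial)
        if best is None:  # no remaining merge lowers the cost
            break
        total, pair, parent = best
        spent += remaining.pop(pair)

    return hard_tuples(soft, parent), total


# Toy soft database: two spellings of one author, plus an unrelated author.
soft_db = [("W. Cohen", "CMU"), ("William Cohen", "CMU"), ("H. Kautz", "AT&T")]
merge_weights = {
    ("W. Cohen", "William Cohen"): 0.4,  # similar strings: cheap to merge
    ("W. Cohen", "H. Kautz"): 3.0,       # dissimilar strings: expensive to merge
}
hard_db, total_cost = greedy_harden(soft_db, merge_weights)
```

On this toy input the cheap merge is accepted because it collapses two soft tuples into one hard tuple, while the expensive merge is rejected because it never pays for itself. Like any greedy procedure, this can miss the optimal hardening, which is consistent with the problem being NP-hard.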