Yandongl writeup of Cohen 2003

From Cohen Courses
Revision as of 06:10, 12 December 2009 by Yandongl (talk | contribs)

This is a review of Cohen_2003_a_comparison_of_string_distance_metrics_for_name_matching_tasks by user:Yandongl.


In this paper, the authors compare several families of string distance metrics: edit-distance-like metrics (Levenshtein etc.), token-based distances (Jaccard similarity etc.), and hybrid distance functions such as a recursive matching scheme and SoftTFIDF. Their experiments show that TFIDF performs best among the token-based similarities and that Monge-Elkan outperforms the other edit-distance-like metrics.
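To make the difference between the first two families concrete, here is a minimal sketch of one metric from each: unit-cost Levenshtein distance (character level) and Jaccard similarity (token level). The function names, the lowercase whitespace tokenization, and the example strings are my own assumptions for illustration; this is not the paper's implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase, whitespace-separated tokens (assumed tokenization)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # convention chosen here: two empty strings are identical
    return len(ta & tb) / len(ta | tb)
```

With these definitions, swapping the word order of a name leaves the Jaccard similarity at 1.0 (`jaccard("William Cohen", "Cohen William")`), while the Levenshtein distance between the same two strings is greater than zero. That order sensitivity is one reason token-based and hybrid schemes are attractive for name matching.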