Yandongl writeup of Cohen 2003

This is a review of Cohen_2003_a_comparison_of_string_distance_metrics_for_name_matching_tasks by user:Yandongl.


In this paper, the authors compare string distance metrics for name matching: edit-distance-like metrics (Levenshtein, etc.), token-based distances (Jaccard similarity, etc.), and hybrid distance functions such as the recursive matching scheme and soft TFIDF. The experiments showed that TFIDF performs best among the token-based similarities and that Monge-Elkan outperforms the other edit-distance-like metrics.
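To make the difference between the two basic families concrete, here is a minimal sketch of one metric from each: Levenshtein distance (character-level) and Jaccard similarity (token-level). It is only an illustration, not the implementation used in the paper. The function names, the unit edit costs, and the lowercase whitespace tokenization are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute (assumed costs)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens (assumed tokenization)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # convention chosen here: two empty strings are identical
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    # Reordered name tokens: the edit distance is large, the token overlap is complete.
    print(levenshtein("William Cohen", "Cohen William"))
    print(jaccard("William Cohen", "Cohen William"))
```

The example pair shows why token-based measures tend to suit names whose parts may be reordered, while character-level measures are better at absorbing typos inside a single token. The hybrid schemes mentioned above try to combine both behaviours.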