Turney 2008 A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations, COLING 2008
Citation
Turney, P.D. (2008), A uniform approach to analogies, synonyms, antonyms, and associations, Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008), Manchester, UK, pp. 905-912.
Online version
Summary
In this paper, the author proposes a supervised, corpus-based method for classifying semantic relations between word pairs, such as analogies, synonyms, antonyms, and associations. The method trains a Support Vector Machine, using the patterns in which a given word pair occurs as features. It achieved results competitive with existing methods that are each designed for a specific task (synonyms only, etc.).
Brief description of the method
For each word pair X:Y, the system extracts all phrases in the corpus that match the template "(0~1 words) X (0~3 words) Y (0~1 words)". It then generates additional patterns from each phrase by replacing subsets of the words (excluding X and Y) with asterisks, so one phrase of length n yields 2^(n-2) patterns. To keep the number of features manageable, the author keeps only the top kN patterns when they are sorted in decreasing order of the number of word pairs that occur in each pattern (N is the number of word pairs, and k is a constant). The intuition is that patterns shared by many pairs are more useful. Once feature selection is done, a feature vector is generated for each pair by taking log(f+1) for each selected pattern (f is the frequency of the pattern for that pair) and then normalizing each column of the feature vectors.
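The steps above can be illustrated with a small sketch. It is not the author's implementation: the function names (wildcard_patterns, select_patterns, feature_matrix) are hypothetical, the phrases are assumed to have been extracted from the corpus already, the pair words are assumed to be replaced by the placeholders X and Y so that different pairs can share a pattern, ties in the ranking are broken alphabetically, and the log(f+1) transform with unit-length column normalization is one reading of the description. Training the SVM on the resulting vectors is left out.

```python
import math
from collections import Counter
from itertools import product


def wildcard_patterns(phrase, x, y):
    """Return all patterns made by replacing any subset of the words other
    than x and y with "*". A phrase of n words gives 2**(n-2) patterns.
    Assumption: x and y are rewritten as the placeholders "X" and "Y"."""
    raw = phrase.split()
    words = ["X" if w == x else "Y" if w == y else w for w in raw]
    free = [i for i, w in enumerate(raw) if w not in (x, y)]
    patterns = set()
    for mask in product((False, True), repeat=len(free)):
        out = list(words)
        for i, starred in zip(free, mask):
            if starred:
                out[i] = "*"
        patterns.add(" ".join(out))
    return patterns


def select_patterns(pair_patterns, k):
    """Keep the top k*N patterns, ranked by the number of word pairs each
    pattern occurs with (N = number of word pairs)."""
    pair_count = Counter()
    for counts in pair_patterns.values():
        pair_count.update(counts.keys())  # one vote per pair per pattern
    limit = k * len(pair_patterns)
    # Assumption: ties are broken alphabetically to keep the result stable.
    ranked = sorted(pair_count, key=lambda p: (-pair_count[p], p))
    return ranked[:limit]


def feature_matrix(pair_patterns, selected):
    """Build one row per pair with log(f + 1) per selected pattern, then
    scale each column to unit Euclidean length (assumed normalization)."""
    pairs = list(pair_patterns)
    rows = [
        [math.log(pair_patterns[p].get(pat, 0) + 1) for pat in selected]
        for p in pairs
    ]
    for j in range(len(selected)):
        norm = math.sqrt(sum(row[j] ** 2 for row in rows))
        if norm > 0:
            for row in rows:
                row[j] /= norm
    return pairs, rows


# Toy input: phrases assumed to be already found in a corpus for each pair.
phrases = {
    ("mason", "stone"): ["the mason cut the stone with"],
    ("carpenter", "wood"): ["the carpenter cut the wood with"],
}

pair_patterns = {}
for (x, y), found in phrases.items():
    counts = Counter()
    for phrase in found:
        counts.update(wildcard_patterns(phrase, x, y))
    pair_patterns[(x, y)] = counts

selected = select_patterns(pair_patterns, k=2)
pairs, rows = feature_matrix(pair_patterns, selected)
```

In this toy input both pairs produce the same set of patterns once the pair words are replaced by X and Y, which is the situation the selection step favors: a pattern such as "the X cut * Y with" is counted once for each pair it occurs with.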