Chklovski and Pantel (2004) VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations

Reviews of this paper

Citation

Timothy Chklovski, Patrick Pantel, VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations. EMNLP 2004: 33-40

Online version

This paper is available here: [1].

Summary

This paper addresses the problem of finding relations between two verbs. The relations considered are similarity, strength, antonymy, enablement, and temporal relations. The authors use lexical surface patterns to find candidate verb pairs for each relation and then rank them with a measure based on mutual information. The measure differs for symmetric and asymmetric relations. An accuracy of 65.5% is obtained in assigning similarity, strength, antonymy, enablement, and happens-before relations on a set of 29,165 associated verb pairs.

Brief description of the method

The objective of this paper is to predict the relation between two verbs.

The following steps are used to predict the relation:

1. Use surface patterns to identify potential verb pairs for each relation. For example, the surface pattern "X ie Y" is used to find all X and Y that are similar to each other. There is more than one surface pattern for each of the six relations considered.

2. Adopt an approach motivated by mutual information to measure the strength of association between the verbs for the relation. The strength for a symmetric relation is given by:


Symetricrelation.png


whereas the strength for an asymmetric relation is given by:


Asymetricrelation.png


They query Google with the instantiated surface patterns to find potential verb pairs for each relation. hits(S) for a string S denotes the number of documents returned by Google when S is queried. C_v is set to 8.5. N is the number of words indexed by Google.

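For asymmetric relations, one more test is applied: the strength of the pair in one direction, S_p(V1, V2), is compared to the strength in the reverse direction, S_p(V2, V1). The relation between V1 and V2 is accepted only if the ratio of these two terms exceeds a threshold (here set to 5).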


In the final step, they prune the identified relations.
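The scoring and the direction test described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the measure has a pointwise-mutual-information-like form, with the joint probability of the instantiated pattern divided by the product of the pattern and verb probabilities, each estimated as a hit count over the number of indexed words. It also assumes that the constant 8.5 scales the verb hit counts (named `c_v` here) and that the symmetric variant pools the counts of both verb orders. All function and parameter names are hypothetical, and hit counts are passed in as plain numbers rather than fetched from a search engine.

```python
def strength_asymmetric(hits_pair, hits_pattern, hits_v1, hits_v2, n_words, c_v=8.5):
    """PMI-style score for an ordered verb pair (assumed form, see lead-in)."""
    p_joint = hits_pair / n_words        # estimate of P(V1, pattern, V2)
    p_pattern = hits_pattern / n_words   # estimate of P(pattern)
    p_v1 = hits_v1 * c_v / n_words       # estimate of P(V1), scaled by c_v
    p_v2 = hits_v2 * c_v / n_words       # estimate of P(V2), scaled by c_v
    return p_joint / (p_pattern * p_v1 * p_v2)


def strength_symmetric(hits_12, hits_21, hits_pattern, hits_v1, hits_v2, n_words, c_v=8.5):
    """Same score, pooling both verb orders (assumed form, see lead-in)."""
    p_joint = (hits_12 + hits_21) / n_words
    p_pattern = 2 * hits_pattern / n_words  # pattern counted once per order
    p_v1 = hits_v1 * c_v / n_words
    p_v2 = hits_v2 * c_v / n_words
    return p_joint / (p_pattern * p_v1 * p_v2)


def passes_direction_test(s_forward, s_reverse, ratio_threshold=5.0):
    """Keep an asymmetric relation only if forward/reverse exceeds the threshold."""
    if s_reverse == 0:
        # No evidence for the reverse direction: accept any positive forward score.
        return s_forward > 0
    return s_forward / s_reverse > ratio_threshold
```

With these definitions, a pair whose forward score is 10 and reverse score is 1 passes the direction test at the threshold of 5, while a pair with scores 4 and 1 does not.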

Surface Patterns

SurfacePatterns.png
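To illustrate how a surface pattern becomes a search query, the sketch below fills the X and Y slots of a pattern with a verb pair. Only the "X ie Y" pattern comes from the description above. The helper names and the use of quoted phrase queries are assumptions made for illustration, and the sketch does not handle patterns with inflected slots.

```python
def instantiate(pattern, v1, v2):
    """Fill the X and Y slots of a surface pattern with two verbs."""
    filled = []
    for token in pattern.split():
        if token == "X":
            filled.append(v1)
        elif token == "Y":
            filled.append(v2)
        else:
            filled.append(token)
    return " ".join(filled)


def build_queries(pattern, v1, v2, symmetric):
    """Return quoted phrase queries; symmetric relations also get the reversed order."""
    queries = ['"' + instantiate(pattern, v1, v2) + '"']
    if symmetric:
        queries.append('"' + instantiate(pattern, v2, v1) + '"')
    return queries


print(instantiate("X ie Y", "buy", "purchase"))  # -> buy ie purchase
```

Replacing whole tokens rather than substrings keeps a slot letter from being substituted inside an ordinary word of the pattern.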

Experimental Result

They randomly selected 100 verb pairs and presented them to two human judges. The baseline system assigns the most frequent relation to every verb pair. The most frequent relation is similarity, which occurs in 33 of the 100 pairs. For their method, a kappa of 0.78 is obtained for the task of judging system tags as correct or incorrect. The overall accuracy of the system is 65.5%. Relation-wise accuracy is as follows:

RelationWiseAccuracy.png
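The agreement figure can be made concrete with a small sketch of how a kappa statistic is computed for two judges. The summary does not say which kappa variant was used, so Cohen's kappa is assumed here. The function name and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: both judges pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both judges used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy judgments on ten system tags (made up for illustration).
judge_1 = ["correct"] * 6 + ["incorrect"] * 4
judge_2 = ["correct"] * 5 + ["incorrect"] * 5
kappa = cohen_kappa(judge_1, judge_2)
```

On the toy data the judges agree on 9 of 10 items and chance agreement is 0.5, so kappa works out to 0.8.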

Related Papers

1. Lin, D.; Zhao, S.; Qin, L.; and Zhou, M. 2003. Identifying synonyms among distributionally similar words. In Proceedings of IJCAI-03.

2. Pantel, P. and Ravichandran, D. 2004. Automatically labeling semantic classes. In Proceedings of HLT/NAACL-2004.