Chklovski and Pantel (2004) Verbocean:Mining the web for fine-grained semantic verb relations

From Cohen Courses
Reviews of this paper

Citation

Timothy Chklovski, Patrick Pantel. VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations. EMNLP 2004: 33-40.

Online version

This paper is available online [1].

Summary

This paper addresses the problem of finding semantic relations between pairs of verbs. The relations considered are similarity, strength, antonymy, enablement, and temporal relations. The authors use lexical patterns to find candidate verb pairs for each relation and then rank the pairs with a measure based on mutual information. The measure differs for symmetric and asymmetric relations. The system achieves an accuracy of 65.5% in assigning similarity, strength, antonymy, enablement, and happens-before relations on a set of 29,165 associated verb pairs.

Brief description of the method

The objective of this paper is to predict the relation between two verbs.

The following steps are used to predict the relation:

1. Use surface patterns to identify candidate verb pairs for each relation. For example, the surface pattern "X ie Y" is used to find verbs X and Y that are similar to each other. There is more than one surface pattern for each of the six relations considered.

2. Adopt an approach motivated by mutual information to measure the strength of association between the two verbs for the relation. The strength for a symmetric relation is given by:


Symetricrelation.png


whereas the strength for an asymmetric relation is given by:


Asymetricrelation.png


They query Google with the surface patterns to find candidate verb pairs for each relation. In the formulas, the hit count of a string is the number of documents Google returns when that string is queried, one constant is set to 8.5, and another term is the number of words indexed by Google.

For asymmetric relations, one more test compares the strength of the relation in one direction with its strength in the reverse direction. The relation between the two verbs is accepted only if the ratio of these two terms exceeds a threshold (here set to 5).
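The steps above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the paper's exact estimator: it assumes the strength is a mutual-information-style ratio of the joint probability of (verb, pattern, verb) to the product of the individual probabilities, with each probability estimated as a hit count divided by the number of indexed words. The function names, the `{x}`/`{y}` template syntax, and the averaging of the two verb orders for symmetric relations are all assumptions made for illustration. Only the "X ie Y" pattern and the ratio threshold of 5 come from the description above.

```python
def instantiate(pattern: str, x: str, y: str) -> str:
    """Fill a surface pattern such as '{x} ie {y}' with two verbs (hypothetical template syntax)."""
    return pattern.format(x=x, y=y)


def strength(joint_hits, pattern_hits, v1_hits, v2_hits, n_words):
    """Assumed PMI-style ratio P(V1, p, V2) / (P(p) * P(V1) * P(V2)).

    Every probability is estimated as hit count / n_words, where n_words
    stands for the number of words indexed by the search engine.
    """
    denominator = (pattern_hits / n_words) * (v1_hits / n_words) * (v2_hits / n_words)
    if denominator == 0:
        return 0.0
    return (joint_hits / n_words) / denominator


def symmetric_strength(forward_hits, backward_hits, pattern_hits, v1_hits, v2_hits, n_words):
    """Symmetric relations: average the hits of both verb orders (an assumption)."""
    joint_hits = (forward_hits + backward_hits) / 2
    return strength(joint_hits, pattern_hits, v1_hits, v2_hits, n_words)


def accept_asymmetric(forward_strength, backward_strength, ratio_threshold=5.0):
    """Keep an asymmetric relation only if one direction clearly dominates the other."""
    if forward_strength <= 0:
        return False
    if backward_strength == 0:
        return True  # no evidence at all for the reverse direction
    return forward_strength / backward_strength > ratio_threshold


# Toy, made-up hit counts (not real search-engine results).
query = instantiate("{x} ie {y}", "buy", "purchase")
forward = strength(40, 1000, 2000, 4000, 1_000_000)
backward = strength(4, 1000, 2000, 4000, 1_000_000)
print(query, forward, backward, accept_asymmetric(forward, backward))
```

In this sketch, a caller would obtain the hit counts by issuing the instantiated pattern strings as search queries; that step is left out because it depends on the search service used.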

Surface Patterns

SurfacePatterns.png

Experimental Result

They randomly selected 100 verb pairs and presented them to two human judges. The baseline system assigns the most frequent relation to every verb pair; this relation occurs 33 times out of 100. A kappa of 0.78 is obtained for the task of judging system tags as correct or incorrect. The overall accuracy of the system is 65.5%.
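The two evaluation quantities mentioned above can be illustrated with a short sketch. It assumes the kappa reported is Cohen's kappa between the two judges, which the summary does not state explicitly, and it uses made-up labels rather than the paper's data. The function names and the placeholder relation names `r1` to `r4` are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two judges, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both judges pick the same label independently.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both judges always use the same single label
    return (observed - expected) / (1 - expected)


def most_frequent_baseline_accuracy(gold_relations):
    """Accuracy of always predicting the most frequent relation."""
    most_common_count = Counter(gold_relations).most_common(1)[0][1]
    return most_common_count / len(gold_relations)


# Made-up judgments of four system tags by two judges.
judge_1 = ["correct", "correct", "incorrect", "incorrect"]
judge_2 = ["correct", "correct", "incorrect", "correct"]
print(cohen_kappa(judge_1, judge_2))

# Placeholder relation labels; the most frequent one covers 33 of 100 pairs.
gold = ["r1"] * 33 + ["r2"] * 30 + ["r3"] * 20 + ["r4"] * 17
print(most_frequent_baseline_accuracy(gold))
```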

Discussion

This paper addresses the problem of unsupervised extraction of narrative event chains. The paper is interesting and novel in its idea of using a common protagonist to gather a set of related narrative events and temporally order them to form a narrative event chain. The implications of this work are manifold. For example, automatic extraction of event chains from documents can be used for automatic template construction and template filling (as in Chambers and Jurafsky, ACL 2010), or to inform better automatic understanding of documents. However, the paper focuses on only one participant of the narrative event chain. In reality, a chain of events may involve more than one participant. The paper therefore presents only one viewpoint of an event. It will be interesting to see whether the method can be used to find and extract the different viewpoints of the same event chain held by its different participants.

Related papers

A related paper is Bean and Riloff (2004) Unsupervised Learning of contextual role knowledge for coreference resolution, which proposes the use of caseframe networks: pairs of related caseframes (a verb/event and a semantic role) that indicate synonymy or relatedness. For example, '<patient> kidnapped' and '<patient> released' are related caseframes. This paper generalizes the use of these caseframes to find the entire set of events (verbs) anchored on the same argument rather than just pairs of related frames.

Another related paper is Chklovski and Pantel (2004) Verbocean:Mining the web for fine-grained semantic verb relations which, similar to this paper, uses pointwise mutual information to automatically find relations between verbs. This work is different in that it uses a protagonist as the indicator of relatedness between the verbs.

The most recent related paper is Chambers and Jurafsky, ACL 2010, which finds clusters of verbs that share arguments in order to define templates automatically from documents.