Chklovski and Pantel (2004) Verbocean:Mining the web for fine-grained semantic verb relations

From Cohen Courses
Revision as of 00:48, 6 November 2012 by Mmahavee (talk | contribs)

Reviews of this paper

Citation

Timothy Chklovski, Patrick Pantel. VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations. EMNLP 2004: 33-40.

Online version

This paper is available online [1].

Summary

This paper addresses the problem of finding semantic relations between pairs of verbs. The relations considered are similarity, strength, antonymy, enablement, and temporal (happens-before) relations. The authors use lexical patterns to find candidate verb pairs for each relation and then rank them using a measure based on mutual information. The measure differs for symmetric and asymmetric relations. An accuracy of 65.5% is obtained in assigning similarity, strength, antonymy, enablement, and happens-before relations on a set of 29,165 associated verb pairs.

Brief description of the method

The objective of this paper is to predict the relation between two verbs.

The following steps are used to predict the relation:

1. Use surface patterns to identify potential verb pairs for a relation. For example, the surface pattern "X i.e. Y" is used to find all X and Y that are similar to each other. There is more than one surface pattern for each of the six relations considered.

2. Then adopt an approach motivated by mutual information to measure the strength of association between the two verbs for the relation. The strength for a symmetric relation is given by:


Symetricrelation.png


whereas the strength for an asymmetric relation is given by:


Asymetricrelation.png


They query Google using the surface patterns to find potential verb pairs for each relation. The formulas use three quantities: the hit count of a string, which is the number of documents Google returns when that string is queried; a constant that is set to 8.5; and the number of words indexed by Google.

For asymmetric relations, one more test is done by comparing the strength of the verb pair in one order with its strength in the reverse order. The relation between the two verbs is considered only if the ratio of these two terms exceeds a certain threshold (here set to 5).
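The scoring and the direction test described above can be sketched in Python. This is a minimal illustration, not the paper's exact formulas, which appear in the equation images above. It assumes a generic mutual-information-style ratio of the joint probability of a verb pair with a pattern to the product of the individual probabilities, with every probability estimated as a hit count divided by the number of indexed words. The function names and parameters are hypothetical, and the constant set to 8.5 is deliberately left out because its exact role is given only in the original formulas.

```python
def pmi_style_strength(hits_pair, hits_pattern, hits_v1, hits_v2, n_words):
    """Mutual-information-style association strength (assumed form).

    Each probability is estimated as a hit count divided by the number
    of indexed words. The paper's own formulas may differ in detail.
    """
    if min(hits_pattern, hits_v1, hits_v2) <= 0 or n_words <= 0:
        return 0.0  # no evidence, so no association
    p_joint = hits_pair / n_words
    p_pattern = hits_pattern / n_words
    p_v1 = hits_v1 / n_words
    p_v2 = hits_v2 / n_words
    return p_joint / (p_pattern * p_v1 * p_v2)


def passes_direction_test(strength_forward, strength_backward, ratio_threshold=5.0):
    """Keep an asymmetric relation only if one direction clearly dominates.

    The threshold of 5 is the value quoted in the text.
    """
    if strength_backward == 0:
        return strength_forward > 0
    return strength_forward / strength_backward > ratio_threshold


# Hypothetical hit counts, for illustration only.
forward = pmi_style_strength(60, 100, 1000, 1000, 1_000_000)
backward = pmi_style_strength(10, 100, 1000, 1000, 1_000_000)
keep_relation = passes_direction_test(forward, backward)
```

With these made-up counts the forward order is six times stronger than the reverse order, so the pair would pass the threshold of 5.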

Surface Patterns

SurfacePatterns.png
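As a rough illustration of how a surface pattern becomes a search query, the sketch below fills the X and Y slots of a pattern with a concrete verb pair. Only the "X i.e. Y" pattern comes from the method description. The helper names, the example verbs, and the choice to query both verb orders for a symmetric relation are assumptions made for this example.

```python
def instantiate(pattern, verb_x, verb_y):
    """Fill the X and Y slots of a surface pattern with two verbs."""
    filled = []
    for token in pattern.split():
        if token == "X":
            filled.append(verb_x)
        elif token == "Y":
            filled.append(verb_y)
        else:
            filled.append(token)  # keep the pattern's literal words
    return " ".join(filled)


def queries_for_pair(pattern, v1, v2, symmetric):
    """Build the query strings for one verb pair.

    Assumption: a symmetric relation is queried in both verb orders,
    an asymmetric one only in the given order.
    """
    queries = [instantiate(pattern, v1, v2)]
    if symmetric:
        queries.append(instantiate(pattern, v2, v1))
    return queries


# Hypothetical verb pair for the similarity pattern.
similarity_queries = queries_for_pair("X i.e. Y", "produce", "create", symmetric=True)
```

Splitting the pattern into tokens, rather than replacing the letters X and Y directly, avoids accidentally rewriting those letters inside the verbs themselves.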

Experimental Result

The experiment is conducted on documents from the Gigaword corpus. The temporal classifier is trained on the TimeBank corpus. For a document, the protagonist is defined as the entity that is mentioned most often in the document. All the methods that follow are built around this protagonist.

In extracting related events, the proposed method shows a 36% improvement over a baseline that learns relatedness strictly from verb co-occurrence (PMI is computed between all occurrences of two verbs in the same document, without requiring the verbs to share a common protagonist). The proposed method also shows a 25% improvement in temporal ordering over a random ordering of the connected events.

An example of an extracted narrative chain (in this case, a possible Prosecution chain) is shown below, where the arrow indicates the before relation.

NarrativeChain.png

Discussion

This paper addresses the problem of unsupervised extraction of narrative event chains. The paper is interesting and novel in its idea of using a common protagonist to gather a set of related narrative events and temporally order them to form a narrative event chain. The implications of this work are manifold. For example, automatic extraction of event chains from documents can be used for automatic template construction and template filling (as in Chambers and Jurafsky, ACL 2010), or to inform better automatic understanding of documents. However, the paper focuses on only one participant of the narrative event chain. In reality, a chain of events may involve more than one participant. The paper therefore presents only one viewpoint of an event. It would be interesting to see whether the method can be used to find and extract different viewpoints of the same event chain from the different participants.

Related papers

A related paper is Bean and Riloff (2004), Unsupervised Learning of Contextual Role Knowledge for Coreference Resolution, which proposes the use of caseframe networks: pairs of related caseframes (each a verb/event and a semantic role) that indicate synonymy or relatedness. For example, '<patient> kidnapped' and '<patient> released' are related caseframes. This paper generalizes the use of these caseframes to find the entire set of events (verbs) anchored on the same argument rather than just pairs of related frames.

Another related paper is Chklovski and Pantel (2004), VerbOcean: Mining the Web for Fine-Grained Semantic Verb Relations, which, like this paper, uses pointwise mutual information to automatically find relations between verbs. This work differs in that it uses a protagonist as the indicator of relatedness between the verbs.

The most recent related paper is Chambers and Jurafsky, ACL 2010, which finds clusters of verbs that share arguments in order to define templates automatically from documents.