Chambers and Jurafsky, Unsupervised Learning of Narrative Event Chains, ACL 2008

Revision as of 01:34, 29 November 2011

Reviews of this paper

Citation

Unsupervised Learning of Narrative Event Chains, by N. Chambers, D. Jurafsky. In Proceedings of ACL-08: HLT, 2008.

Online version

This paper is available online [1].

Summary

This paper addresses the problem of unsupervised extraction of narrative event chains, which are partially ordered sets of narrative events centered around a common protagonist. To extract narrative event chains, the paper assumes two things:

  • narrative coherence: verbs sharing co-referring arguments are semantically connected by virtue of narrative discourse structure
  • the protagonist: although a narrative has several participants, there is a central actor, the protagonist, who characterizes the narrative chain

Narrative chains are somewhat related to structured sequences of participants and events that are called scripts (Schank and Abelson, 1977).

As covered in the guest lecture by Brendan O'Connor, scripts were central to natural language understanding research in the 1970s and 1980s for tasks such as question answering, story understanding, and summarization. For example, Schank and Abelson's (1977) restaurant script was proposed to understand text about restaurants in terms of the participants (customer, waiter, etc.), their actions or movements (entering, sitting down, ordering, ingesting, etc.), and the effects resulting from each action, which together can be seen as a collection of events with some sort of (temporal) ordering. These scripts or templates (or "sketchy scripts" in the case of DeJong's FRUMP system, 1982) were written manually. The purpose of this paper is to learn such "scripts" automatically from a collection of documents.

Brief description of the method

The objective of this paper is to automatically extract narrative event chains, which are collections of events that share a common protagonist and are temporally ordered.

The paper proposes a three-step process for learning narrative event chains:

In the first step, relatedness between pairs of narrative events is computed using pointwise mutual information:

    pmi(e(x,d), e(y,f)) = log [ P(e(x,d), e(y,f)) / ( P(e(x,d)) P(e(y,f)) ) ]

    P(e(x,d), e(y,f)) = C(e(x,d), e(y,f)) / Σ_{x,y} Σ_{d,f} C(e(x,d), e(y,f))

where e(w,d) is a narrative event (e.g. e(push,patient)) and C(e(x,d),e(y,f)) is the number of times the two narrative events e(x,d) and e(y,f) share a co-referring argument filling the values of the typed dependencies d and f.
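The PMI computation above can be sketched in a few lines of Python. The event pairs and counts below are hand-made illustrations, not data from the paper, and the marginal estimate is a simplification of the paper's count normalization:

```python
import math
from collections import Counter

# Hypothetical co-occurrence counts: C(e(x,d), e(y,f)) = number of times the
# two typed events share a co-referring argument in some document.
pair_counts = Counter({
    (("push", "subject"), ("fall", "subject")): 12,
    (("arrest", "object"), ("convict", "object")): 30,
    (("arrest", "object"), ("fall", "subject")): 1,
})
total = sum(pair_counts.values())

def marginal(event):
    """P(e), estimated from the pair counts in which the event participates."""
    return sum(c for pair, c in pair_counts.items() if event in pair) / total

def pmi(e1, e2):
    """Pointwise mutual information between two narrative events."""
    joint = pair_counts.get((e1, e2), 0) + pair_counts.get((e2, e1), 0)
    if joint == 0:
        return float("-inf")
    return math.log((joint / total) / (marginal(e1) * marginal(e2)))

# Events that often share a protagonist score higher than rare pairings.
print(pmi(("arrest", "object"), ("convict", "object")))
```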

Thus, given a chain of narrative events, the most likely event to add to the chain can be computed by:

    max_{j : 0 < j < m} Σ_{i=0}^{n} pmi(e_i, f_j)

where n is the number of events already in the chain, e_i is the i-th event in the chain, and m is the number of candidate events f_j available to be selected next. An unordered set of narrative events related by a common protagonist can therefore be extracted using this method.
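The chain-extension score can be sketched as follows; the `pmi_scores` table and the event names are hypothetical, standing in for PMI values computed as in the first step:

```python
# Hypothetical PMI scores between events already in a chain and candidates.
pmi_scores = {
    ("arrest", "convict"): 1.2, ("arrest", "sentence"): 0.9,
    ("plead", "convict"): 1.0,  ("plead", "sentence"): 0.4,
    ("arrest", "eat"): -0.5,    ("plead", "eat"): -0.8,
}

def best_next_event(chain, candidates):
    """Pick the candidate f_j maximizing sum_i pmi(e_i, f_j) over the chain."""
    return max(candidates,
               key=lambda f: sum(pmi_scores.get((e, f), 0.0) for e in chain))

# "convict" wins: 1.2 + 1.0 beats "sentence" (1.3) and "eat" (-1.3).
print(best_next_event(["arrest", "plead"], ["convict", "sentence", "eat"]))
```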

In the second step, given the unordered set of narrative events, the paper finds a temporal ordering of the events, using a two-step approach as in Chambers et al. (2007) Classifying temporal relations between events:

  • use an SVM to label the temporal attributes of events, such as tense, grammatical aspect, etc.
  • use an SVM, with the attributes predicted in the first step combined with other linguistic features, to classify the temporal relations between events
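A rough sketch of this two-stage pipeline is shown below. The paper trains SVM classifiers for both stages; the rules here are only trivial stand-ins to show how stage-1 attributes feed the stage-2 relation classifier:

```python
# Two-stage pipeline, with rule-based stand-ins for the two SVMs.

def stage1_attributes(event):
    """Stand-in for SVM 1: label temporal attributes (tense, aspect, ...)."""
    word = event["word"]
    tense = "past" if word.endswith("ed") else "present"
    return {**event, "tense": tense, "aspect": "simple"}

def stage2_relation(e1, e2):
    """Stand-in for SVM 2: classify the temporal relation between two events,
    using stage-1 attributes plus other (here: positional) features."""
    if e1["tense"] == "past" and e2["tense"] != "past":
        return "before"
    return "before" if e1["position"] < e2["position"] else "after"

a = stage1_attributes({"word": "arrested", "position": 0})
b = stage1_attributes({"word": "convicts", "position": 1})
print(stage2_relation(a, b))  # → before
```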

In the third step, agglomerative clustering using the PMI scores from the first step is used to find sets of narrative events. The ordering relations from the second step are then used to produce a directed graph over each set of narrative events.
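A minimal sketch of this third step, assuming hand-made PMI scores and before relations; greedy single-link merging with a score threshold stands in for the paper's agglomerative clustering:

```python
# Hand-made pairwise PMI scores and step-2 ordering relations.
pmi = {("arrest", "convict"): 1.2, ("convict", "sentence"): 1.1,
       ("eat", "drink"): 0.9, ("arrest", "eat"): -0.7}
before = {("arrest", "convict"), ("convict", "sentence")}

def score(a, b):
    return pmi.get((a, b), pmi.get((b, a), float("-inf")))

def cluster(events, threshold=0.5):
    """Greedily merge the two clusters with the highest cross-cluster PMI
    until no pair scores above the threshold."""
    clusters = [{e} for e in events]
    while True:
        best = max(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: max(score(a, b)
                                      for a in clusters[ij[0]]
                                      for b in clusters[ij[1]]),
                   default=None)
        if best is None:
            return clusters
        i, j = best
        if max(score(a, b) for a in clusters[i] for b in clusters[j]) < threshold:
            return clusters
        clusters[i] |= clusters.pop(j)

chains = cluster(["arrest", "convict", "sentence", "eat", "drink"])
# Directed edges within each cluster, taken from the step-2 before relations:
graphs = [sorted((a, b) for (a, b) in before if a in c and b in c)
          for c in chains]
```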

Experimental Result

The experiments are conducted on documents from the Gigaword corpus. The temporal classifier is trained on the TimeBank Corpus. For a document, the protagonist is defined as the entity mentioned the most times in that document; the methods described above are all built around this protagonist.
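The protagonist heuristic amounts to a most-frequent-entity count. A minimal sketch, assuming coreference-resolved entity mentions are already available (the mention list is hypothetical):

```python
from collections import Counter

# Hypothetical mentions after coreference resolution: each mention is mapped
# to its entity id.
mentions = ["SUSPECT", "POLICE", "SUSPECT", "JUDGE", "SUSPECT", "POLICE"]

# The protagonist is the entity mentioned most often in the document.
protagonist = Counter(mentions).most_common(1)[0][0]
print(protagonist)  # → SUSPECT
```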

In terms of extracting related events, the proposed method shows a 36% improvement over a baseline that learns relatedness strictly from verb co-occurrence (PMI computed between all occurrences of two verbs in the same document, without requiring the verbs to share a common protagonist). The proposed method also shows a 25% improvement for temporal ordering over a random ordering of the connected events.

An example of an extracted narrative chain (in this case, a possible Prosecution chain) is shown below, where the arrows indicate the before relation.

NarrativeChain.png

Discussion

This paper addresses the problem of unsupervised extraction of narrative event chains. The paper is interesting and novel in its idea of using a common protagonist to gather a set of related narrative events and temporally order them to form a narrative event chain. The implications of this work are manifold: for example, the automatic extraction of event chains from documents can be used for automatic template construction and template filling (as in Chambers and Jurafsky, ACL 2010), or to inform better automatic understanding of documents. However, the paper focuses on just one participant of the narrative event chain. In reality, a chain of events may involve more than one participant, so the paper presents only one viewpoint of an event. It would be interesting to see whether the method could be used to find and extract different viewpoints of the same event chain.

Related papers

A related paper is Bean and Riloff (2004), Unsupervised Learning of contextual role knowledge for coreference resolution, which proposes the use of caseframe networks: pairs of related caseframes (a verb/event and a semantic role) that indicate synonymy or relatedness; for example, '<patient> kidnapped' and '<patient> released' are related caseframes. This paper generalizes the use of such caseframes to find entire sets of events (verbs) anchored on the same argument, rather than just pairs of related frames.

Another related paper is Chklovski and Pantel (2004), Verbocean: Mining the web for fine-grained semantic verb relations, which, similar to this paper, uses pointwise mutual information to automatically find relations between verbs. This paper differs in that it uses a protagonist as the indicator of relatedness between the verbs.

The most recent related paper is Chambers and Jurafsky, ACL 2010, which finds clusters of verbs that share arguments in order to define templates automatically from documents.