Difference between revisions of "Ling, X. and Weld, D. Temporal Information Extraction. AAAI-10"

Latest revision as of 00:46, 3 November 2011

Reviews of this paper

Citation

Temporal Information Extraction, by X. Ling, D.S. Weld. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10), 2010.

Online version

This paper is available online [1].

Summary

This paper addresses the problem of temporal ordering. Unlike other work that focuses on time-agnostic extraction of facts, this paper presents TIE, a pipeline system that tries to extract as many facts from text as possible while also inducing as much temporal information as possible. Temporal relations between events and times in a sentence are identified, and transitivity is enforced to bound the start and end time points of each event. The paper also presents temporal entropy as a way to evaluate the recall of temporal information extraction systems.

Brief description of the method

Given a sentence, the objective is to output (1) a maximal set of temporal elements that are either events (e.g. "officers were dispatched") or temporal references (e.g. "1999") and (2) the tightest set of temporal constraints directly implied by the text.

The constraints are linear inequalities of the form $p_{1} \leq p_{2} + d$, where $d$ is a duration (often zero) and $p_{1}$ and $p_{2}$ each denote either the beginning point ($^{\triangleleft}e$) or the ending point ($e^{\triangleright}$) of a temporal element $e$.

For example, the sentence 'Steve Jobs revealed the iPhone in 2007' might produce these constraints:

$^{\triangleleft}\text{Year-2007} \leq {}^{\triangleleft}\text{Reveal(Jobs, iPhone)}$

$\text{Reveal(Jobs, iPhone)}^{\triangleright} \leq \text{Year-2007}^{\triangleright}$
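The constraint form above can be made concrete with a small sketch. This is only an illustration and not code from the paper: the `Point` and `Constraint` classes and the `holds` method are hypothetical names, times are assumed to be fractional years, and the event date used for the check is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Beginning or ending time point of a temporal element (hypothetical type)."""
    element: str    # e.g. "Year-2007" or "Reveal(Jobs, iPhone)"
    is_start: bool  # True for the beginning point, False for the ending point


@dataclass(frozen=True)
class Constraint:
    """Linear inequality p1 <= p2 + d, where d is a duration (often zero)."""
    p1: Point
    p2: Point
    d: float = 0.0

    def holds(self, times):
        """Check the inequality against a dict mapping points to numeric times."""
        return times[self.p1] <= times[self.p2] + self.d


year_start = Point("Year-2007", True)
year_end = Point("Year-2007", False)
reveal_start = Point("Reveal(Jobs, iPhone)", True)
reveal_end = Point("Reveal(Jobs, iPhone)", False)

# The two constraints from the example sentence.
constraints = [
    Constraint(year_start, reveal_start),  # start of 2007 <= start of Reveal
    Constraint(reveal_end, year_end),      # end of Reveal <= end of 2007
]

# Assumed grounding in fractional years; the event time is illustrative only.
times = {
    year_start: 2007.0,
    year_end: 2008.0,
    reveal_start: 2007.02,
    reveal_end: 2007.02,
}
all_satisfied = all(c.holds(times) for c in constraints)
```

Representing both events and time references through their two end points keeps every constraint in the same form, so a grounded timeline can be checked with a single comparison per constraint.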

The paper introduces a pipeline system called TIE with these components:

Preprocessing: (1) parse the sentence with a syntactic parser (the Stanford parser), (2) detect the semantic roles of the verbs in the sentence using SRL (semantic role labeling), (3) use TARSQI to find the temporal elements (events and times) in the sentence, and (4) generate features for each temporal element as well as for pairs of elements.

The features include descriptive features (tense, grounded time values, etc.) generated for each element by TARSQI, and syntactic features (dependency features and SRL features).

Classification: use a probabilistic model trained on the TimeBank Corpus, combined with transitivity rules, to classify the point-wise relation between each pair of points ($^{\triangleleft}e$ or $e^{\triangleright}$) of the elements.

Markov Logic Networks are used to make the predictions. Formula weights are learned from the TimeBank data. Some of the formula templates are as follows:

$\mathit{after}(p,q) \land \mathit{after}(q,r) \to \mathit{after}(p,r)$

$\mathit{srl\_after}(p,q) \to \mathit{after}(p,q)$

where the first formula is transitivity and the second formula integrates the temporal information provided by SRL.
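To illustrate what the two templates express, the sketch below applies them as hard rules by forward chaining. This is a deliberate simplification and an assumption of this example: a Markov Logic Network attaches learned weights to such formulas and performs probabilistic inference, which is not reproduced here. The function name `close_after` and the point names are hypothetical, and `after(p, q)` is read as "p is after q".

```python
def close_after(after_facts, srl_after_facts):
    """Forward-chain the two templates as hard rules until nothing new appears.

    Facts are (p, q) pairs meaning "p is after q".
    """
    # srl_after(p, q) -> after(p, q)
    after = set(after_facts) | set(srl_after_facts)

    # after(p, q) AND after(q, r) -> after(p, r)
    changed = True
    while changed:
        changed = False
        for (p, q) in list(after):
            for (q2, r) in list(after):
                if q == q2 and (p, r) not in after:
                    after.add((p, r))
                    changed = True
    return after


# "c is after b" comes from the classifier, "b is after a" from SRL.
closed = close_after({("c", "b")}, {("b", "a")})
```

Here the closure adds the pair `("c", "a")`, a relation that neither source stated directly, which gives some intuition for why the review reports that transitivity helps recall.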

Experimental Result

The experiment is conducted on sentences extracted from Wikipedia articles. The ground truth is constructed by asking human subjects to label the temporal relations between all pairs of elements by comparing their start and end points. Each pair is labeled by at least two people, with a third person resolving any disagreement.

The performance of TIE is measured using precision, the fraction of correct predictions among all predictions made on the test data, and temporal entropy (TE), which captures the number and "tightness" of the induced temporal bounds. A human or a system can give time bounds for the start and end points of each element in the text. The temporal entropy of a time point is then the log of the length of the tightest bound produced for that point. The lower the entropy, the tighter (better) the bound.
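The two measures can be sketched as follows. This follows only the description given here; the log base, the time unit (days), the treatment of unbounded points, and any aggregation over points are assumptions of this sketch, and the function names are hypothetical.

```python
import math


def temporal_entropy(lower, upper):
    """Log of the length of the tightest bound [lower, upper] for a time point.

    Bounds are assumed to be in days and the natural log is used.
    """
    length = upper - lower
    if length <= 0:
        raise ValueError("bound must have positive length")
    return math.log(length)


def precision(predicted, gold):
    """Fraction of predicted constraints that also appear in the gold set."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(gold)) / len(predicted)


# A point known only to lie within a year versus one known to lie within a month.
te_year = temporal_entropy(0, 365)
te_month = temporal_entropy(0, 31)
```

Under these assumptions `te_month` is smaller than `te_year`, matching the statement that lower entropy corresponds to a tighter bound.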

The precision and TE of TIE are compared against three other systems: (1) PASCA, a pattern-based extraction system for question answering, (2) SRL, and (3) TARSQI. TIE extracts more correct constraints than PASCA and SRL at comparable precision, and its temporal entropy is the closest to that of human annotators. The experiments also show that the SRL features improve the precision of TIE, while the transitivity rule improves its recall. On the TempEval task, TIE achieves an accuracy of 0.695, which is comparable to the current state of the art of 0.716 (Yoshikawa et al. 2009).

Discussion

This paper addresses the problems of temporal information extraction and temporal ordering. It assembles existing systems into a pipeline called TIE that extracts temporal elements and their features from text, and it uses Markov Logic Networks to classify the temporal constraints between the start and end time points of the extracted elements. The paper also proposes a measure called temporal entropy that captures the number and the tightness of the constraints produced.

TIE is shown to outperform the other three approaches (PASCA, SRL, and TARSQI): it extracts more correct constraints at comparable precision, and its temporal entropy is the closest to that of humans.

Despite the good results, the contribution of this paper to temporal ordering and temporal information extraction seems rather minimal, as it mainly combines existing systems. The idea of using Markov Logic Networks for temporal ordering is also not new, and the earlier system of Yoshikawa et al. (2009) even outperforms TIE in terms of accuracy. The paper's main argument is that it predicts a different kind of temporal ordering than other systems: instead of classifying temporal relations between elements into a set of classes (BEFORE, AFTER, OVERLAP), it classifies the temporal relations between the elements' start and end time points.
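The difference between the two formulations can be illustrated with a short sketch that derives the coarse interval label from end-point comparisons. The function name is hypothetical, and the treatment of touching end points (strict inequalities for BEFORE and AFTER) is an assumption of this example rather than a convention taken from the paper.

```python
def interval_relation(a, b):
    """Classify intervals a = (start, end) and b = (start, end) by their end points."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "BEFORE"   # a ends before b starts
    if b_end < a_start:
        return "AFTER"    # a starts after b ends
    return "OVERLAP"      # the intervals share at least one time point


year_2007 = (2007.0, 2008.0)
contained = (2007.02, 2007.02)   # lies entirely within the year
straddling = (2007.9, 2008.3)    # starts in the year, ends after it

label_contained = interval_relation(contained, year_2007)
label_straddling = interval_relation(straddling, year_2007)
```

Both events receive the same coarse label relative to the year, although their end points relate to the year's end differently. The point-wise formulation keeps that distinction.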

Related papers

A related paper is Yoshikawa et al. 2009, "Jointly Identifying Temporal Relations with Markov Logic", which also uses Markov Logic Networks to perform a softer (non-deterministic) joint inference for the temporal ordering of events and times.

Another related paper is Denis and Muller, "Predicting Globally-Coherent Temporal Structures from Texts via Endpoint Inference and Graph Decomposition", IJCAI 2011, which, like this paper, attempts to classify all types of temporal relations (not just before/after) in the TimeBank Corpus by translating temporal interval relations into relations between their end points (start and end).

