Difference between revisions of "S. Patwardhan and E. Riloff. EMNLP 2009"

From Cohen Courses

Revision as of 04:36, 30 November 2010

Citation

S. Patwardhan and E. Riloff. A unified model of phrasal and sentential evidence for information extraction. In EMNLP 2009.

Online version

Unified model for IE

Summary

Previous IE systems make decisions based only on the immediate context around a phrase. The authors argue that more complex tasks, such as event extraction, often need a larger field of view to understand how facts tie together. This paper proposes a new model for event extraction. To determine whether a noun phrase NPi should be extracted as a filler for an event role, the model computes the joint probability that NPi:

  1. appears in an event sentence, and
  2. is a legitimate filler for the event role.
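By the chain rule, a joint probability of this kind factors into the probability that the sentence is an event sentence, times the probability that the noun phrase is a legitimate filler given that it is. The Python sketch below only illustrates that combination. The function names, the threshold, and the example numbers are assumptions made for illustration, not values or interfaces taken from the paper.

```python
def joint_extraction_probability(p_event_sentence, p_filler_given_event):
    """Chain rule: P(event sentence, filler) =
    P(event sentence) * P(filler | event sentence)."""
    for p in (p_event_sentence, p_filler_given_event):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_event_sentence * p_filler_given_event


def should_extract(p_event_sentence, p_filler_given_event, threshold=0.5):
    # `threshold` is a hypothetical cut-off chosen for illustration only.
    joint = joint_extraction_probability(p_event_sentence, p_filler_given_event)
    return joint >= threshold
```

Under this sketch, a noun phrase that looks like a strong filler is still rejected when its sentence is unlikely to describe a relevant event, and vice versa.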

To compute the probability that a sentence describes a relevant event, the authors use an SVM. Because an SVM is not a probabilistic classifier, they use the margin as an indicator of confidence, which worked well for them. The features are named entities, lexico-syntactic patterns, sentence length, bag of words, and verb tense.
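This summary does not say how the margin is turned into a confidence score. One common choice, assumed here purely for illustration, is to pass the signed margin through a logistic function so that larger positive margins map to values closer to 1. The function name and the `scale` parameter are hypothetical.

```python
import math


def margin_to_confidence(margin, scale=1.0):
    """Map a signed SVM margin to a value in (0, 1) with a logistic function.

    This is an assumed mapping, not necessarily the one used in the paper.
    """
    z = scale * margin
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

A sentence lying exactly on the decision boundary (margin 0) gets a confidence of 0.5 under this mapping.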

To determine whether a noun phrase can be a legitimate filler for a specific type of event role based on its local context, the authors used a Naive Bayes classifier. The features include lexical matches, semantic features, and syntactic relations.
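As a rough illustration of the idea, the following is a minimal multinomial Naive Bayes sketch with add-one smoothing over string-valued features. The class, the feature strings (such as `sem=HUMAN`), and the toy training examples are all invented for this example. They are not the paper's feature set or data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant (assumed)
        self.label_counts = Counter()           # how often each label occurs
        self.feature_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocabulary = set()

    def fit(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for f in features:
                self.feature_counts[label][f] += 1
                self.vocabulary.add(f)
        return self

    def predict_proba(self, features):
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            score = math.log(n / total)         # log prior
            denom = (sum(self.feature_counts[label].values())
                     + self.alpha * len(self.vocabulary))
            for f in features:
                if f in self.vocabulary:        # ignore unseen features
                    count = self.feature_counts[label][f]
                    score += math.log((count + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid underflow.
        top = max(log_scores.values())
        exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
        z = sum(exp_scores.values())
        return {l: v / z for l, v in exp_scores.items()}


# Toy, made-up examples: (local-context features, is legitimate filler?)
toy_examples = [
    (["head=terrorists", "sem=HUMAN", "rel=subject"], True),
    (["head=guerrillas", "sem=HUMAN", "rel=subject"], True),
    (["head=building", "sem=BUILDING", "rel=object"], False),
    (["head=village", "sem=LOCATION", "rel=object"], False),
]
toy_model = NaiveBayes().fit(toy_examples)
```

Calling `toy_model.predict_proba(["sem=HUMAN", "rel=subject"])` returns a dictionary of label probabilities that favors the `True` label on this toy data.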

The data sets are the MUC-4 terrorism corpus and the ProMed disease outbreaks corpus. When event information is discussed later in a document, far removed from the main event description, the sentential event recognizer tends to generate low probabilities for those sentences.