Chambers and Jurafsky, ACL 2010
Revision as of 01:05, 3 November 2011
Template-Based Information Extraction without the Templates is a paper by Chambers and Jurafsky which can be found online.
Citation
Chambers and Jurafsky. Template-Based Information Extraction without the Templates. In ACL 2010.
Summary
This paper proposes a method for both finding and filling out templates in an unsupervised manner. Templates are important in information extraction, where they sit a step above semantic role labelling: in a general sense, a template indicates what is happening and who is involved. In most cases templates are either given in advance (to be filled) or hand-engineered for a specific task; this paper attempts both template induction and template filling without supervision.
Method
Although the method the paper follows can appear to be a mishmash of steps, the underlying ideas are quite clear. The first main idea is that events belonging to the same template tend to be mentioned near each other. For example, if the verbs "kidnap" and "taken" are found close together in the corpus, that is a piece of evidence that "kidnap" and "taken" belong in the same template. Verbs that accumulate a lot of such evidence are grouped together into one template.
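The grouping idea can be illustrated with a small sketch. This is not the paper's actual scoring or clustering procedure: it assumes that verbs have already been extracted per sentence, counts how often two verbs occur within a few sentences of each other, and then links verbs whose count reaches a threshold. The names `cooccurrence_counts`, `group_verbs`, `window`, and `min_count` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, window=1):
    """Count verb pairs that occur within `window` sentences of each other.

    `documents` is a list of documents; each document is a list of
    sentences; each sentence is a list of verbs (assumed pre-extracted).
    """
    counts = Counter()
    for doc in documents:
        mentions = [(i, verb) for i, sent in enumerate(doc) for verb in sent]
        for (i, a), (j, b) in combinations(mentions, 2):
            if a != b and abs(i - j) <= window:
                counts[frozenset((a, b))] += 1
    return counts


def group_verbs(counts, min_count=2):
    """Link verbs whose proximity count is at least `min_count` (union-find).

    A simple threshold stands in for whatever association measure and
    clustering method a real system would use.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair, n in counts.items():
        a, b = tuple(pair)
        root_a, root_b = find(a), find(b)  # also registers both verbs
        if n >= min_count and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for verb in list(parent):
        groups.setdefault(find(verb), set()).add(verb)
    return list(groups.values())


# Toy corpus: four documents, each a list of sentences (lists of verbs).
documents = [
    [["kidnap"], ["taken"], ["release"]],
    [["kidnap", "taken"], ["elect"]],
    [["elect"], ["vote"]],
    [["vote", "elect"]],
]
counts = cooccurrence_counts(documents, window=1)
groups = group_verbs(counts, min_count=2)
```

On this toy corpus "kidnap" and "taken" are seen together twice and end up in one group, while "elect" and "vote" form another; a single stray co-occurrence such as "taken" with "elect" is not enough to merge the two groups.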
The other main idea addresses data sparsity: the verb relations have to be learned from the given corpus, but the corpus does not contain enough examples to find reliable patterns about the verbs. To compensate, the authors expanded the corpus at hand with text from the New York Times and Reuters sections of the Gigaword corpus, keeping only text related to a template under consideration.
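The corpus-expansion step can be pictured as a filter over a larger collection. The sketch below is an assumption made for illustration, not the retrieval criterion used in the paper: it keeps a document from the larger corpus only if it mentions at least `min_hits` of the verbs in a candidate template. The function name `retrieve_related` and its parameters are hypothetical.

```python
import re


def retrieve_related(template_verbs, large_corpus, min_hits=2):
    """Return documents that mention at least `min_hits` of the template's verbs.

    Plain word overlap is used here; a real system would at least
    lemmatize the text so that "kidnapped" matches "kidnap".
    """
    selected = []
    for doc in large_corpus:
        tokens = set(re.findall(r"[a-z]+", doc.lower()))
        if len(tokens & set(template_verbs)) >= min_hits:
            selected.append(doc)
    return selected


# Toy stand-in for the larger news collection.
large_corpus = [
    "Rebels kidnap two workers; hostages taken north.",
    "Voters elect a new mayor.",
    "Man taken to hospital.",
]
related = retrieve_related({"kidnap", "taken", "release"}, large_corpus)
```

Requiring more than one matching verb keeps out documents such as the third one, where "taken" appears in an unrelated sense.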
Related Work
Other work has also addressed finding templates in an unsupervised manner, notably Patwardhan and Riloff, EMNLP 2007 and Patwardhan and Riloff, EMNLP 2009.