LoBue and Yates, ACL 2011

Citation

LoBue, Peter and Yates, Alexander. 2011. Types of Common-Sense Knowledge Needed for Recognizing Textual Entailment. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, 329-334.

Online Version

http://clair.si.umich.edu/clair/anthology/query.cgi?type=Paper&id=P11-2057

Summary

This short paper from ACL 2011 studies the deficiencies of current systems for recognizing textual entailment: given a piece of text, decide whether a particular conclusion is justified by it. The task counts as structured prediction because a system that recognizes textual entailment must include some form of information extraction, among other components.

To solve the textual entailment problem, a system must extract information from a document and reason about what conclusions can be drawn from it. This is very difficult for a machine because text written by humans relies on a large body of background knowledge that is implicitly understood and never stated. The paper examines a prominent data set for the task and categorizes the kinds of knowledge a machine would need in order to solve the problem correctly.

The categories of knowledge come from an intensive manual analysis by a graduate student. The authors judge the usefulness and accuracy of those categories by measuring inter-annotator agreement among 5 non-technical undergraduate students, who label the data with the kind of knowledge needed.

Experimental Result

They found that their categories of knowledge were reasonably reproducible, with an average Fleiss's kappa of 0.678. The most prominent categories of required background knowledge are geography, functionality (e.g., a person has only one father), definitions (e.g., "octuplets" are eight children), and preconditions (e.g., naturalized citizens of a country were not born there).
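For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of the standard Fleiss's kappa computation for a fixed number of raters per item. It is not the authors' code, and the summary does not say how they averaged kappa across categories. The function name `fleiss_kappa` and the toy label counts are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Standard Fleiss's kappa.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        # All ratings fall in one category; kappa is undefined in this case.
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 4 entailment pairs, 5 annotators, and one yes/no
# question per pair ("does this pair need geography knowledge?").
# Columns are [yes, no]. These numbers are made up for illustration.
toy_counts = [
    [5, 0],
    [4, 1],
    [0, 5],
    [1, 4],
]
print(round(fleiss_kappa(toy_counts), 3))
```

A kappa of 1 means perfect agreement and a kappa near 0 means agreement no better than chance, which is the scale on which the reported 0.678 should be read.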

This result is interesting because it highlights areas where information extraction and knowledge base building need to improve in order to better reason about the implicit knowledge in natural language.

Related Papers

Coreference Resolution in a Modular, Entity-Centered Model, by Haghighi & Klein, 2010, is related to this paper because solving coreference problems also requires reasoning about the text. Haghighi & Klein accomplish this through some syntactic rules and by learning distributions over the noun phrases used to describe entities. If they had also incorporated facts about those entities, they probably would have done better.

Rahman and Ng, ACL 2011, use features from knowledge sources to improve coreference resolution, a related task.