Difference between revisions of "Palmer et al Computational Linguistics 2005"

Latest revision as of 22:22, 30 November 2010

Citation

Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics, 31(1):71–106.

Online version

MIT Press

Summary

The paper describes how the Proposition Bank (PropBank) corpus was built, and presents an automatic system for Semantic Role Labeling trained on the PropBank corpus.


For automatic determination of semantic role labels, they adapted the features and probability model of Gildea and Jurafsky Computational Linguistics 2002 to the PropBank for their experiments. While Gildea and Jurafsky Computational Linguistics 2002 did not have gold-standard parse trees, PropBank does, and the system shows improved performance when these gold-standard parses are used.


The dataset used has annotations for 72,109 predicate-argument structures containing 190,815 individual arguments, with examples from 2,462 lexical predicate types.


Features used for the system are the phrase type, the parse tree path, the position, the voice, and the head word.
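As a rough illustration (not the authors' code), extracting these Gildea and Jurafsky-style features for one candidate constituent can be sketched as follows. The tree representation, the `^`/`!` path notation (standing in for the up/down arrows of the original feature), and the crude first-word "head" are all simplifying assumptions:

```python
class Node:
    """A parse-tree node; leaves carry a word and have no children."""
    def __init__(self, label, children=(), word=None):
        self.label, self.word, self.parent = label, word, None
        self.children = list(children)
        for c in self.children:
            c.parent = self

def ancestors(node):
    """The node itself followed by its ancestors up to the root."""
    chain = [node]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain

def leaves(node):
    return [node] if not node.children else [l for c in node.children for l in leaves(c)]

def tree_path(pred, const):
    """Parse-tree path feature, e.g. 'VBD^VP^S!NP' (^ = up, ! = down here)."""
    up, down = ancestors(pred), ancestors(const)
    common = next(n for n in up if n in down)          # lowest common ancestor
    up_labels = [n.label for n in up[: up.index(common) + 1]]
    down_labels = [n.label for n in reversed(down[: down.index(common)])]
    return "^".join(up_labels) + "".join("!" + l for l in down_labels)

def extract(root, pred, const, passive=False):
    order = leaves(root)
    return {
        "phrase type": const.label,
        "path": tree_path(pred, const),
        "position": "before" if order.index(leaves(const)[0]) < order.index(pred) else "after",
        "voice": "passive" if passive else "active",
        # crude stand-in: first word of the constituent (real systems use head rules)
        "head word": leaves(const)[0].word,
    }

# "He saw the cat": S -> NP(He) VP(saw, NP(the cat)); predicate = "saw"
he, saw = Node("PRP", word="He"), Node("VBD", word="saw")
np1 = Node("NP", [he])
np2 = Node("NP", [Node("DT", word="the"), Node("NN", word="cat")])
root = Node("S", [np1, Node("VP", [saw, np2])])
feats = extract(root, saw, np1)
print(feats["path"], feats["position"])   # VBD^VP^S!NP before
```

In the real system these categorical features feed a probability model over roles; the sketch only shows the feature extraction step.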


The system was tested on two tasks: (1) predicting the correct semantic role given the constituents that are arguments of the predicate, and (2) both finding the arguments in the sentence and predicting their semantic roles.


The system shows 80.9% accuracy in predicting the semantic role of pre-segmented constituents with automatic parses, and 82.8% accuracy with gold-standard parses. It gives 82% precision and 74.7% recall when both finding the arguments and predicting their semantic roles.


In addition, they showed that full parse trees are much more informative than a chunked representation for labeling semantic roles: the system achieves 74.3% precision and 66.4% recall with full parses, versus 49.5% precision and 35.1% recall with chunks.

Key Contribution

This paper describes the process of building the PropBank corpus, one of the most representative corpora for semantic role labeling, and tests the statistical model of Gildea and Jurafsky Computational Linguistics 2002, the first such model for this problem, on the corpus. The paper also serves as a baseline for later experiments on PropBank.