== Citation ==

Ellen Riloff, Janyce Wiebe, and Theresa Wilson. Learning Subjective Nouns using Extraction Pattern Bootstrapping. In the ''International Conference on Natural Language Learning''. 2003.
 
== Summary ==

Subjectivity classification is a task that shows large jumps in performance when keyword lists are available. Since so much of the information for this task is encoded at the lexical level, it is a particularly useful domain for testing the efficacy of dictionary-building [[UsesMethod::Bootstrapping|bootstrapping]] systems. In this paper, two bootstrapping systems, MetaBoot and Basilisk, are compared for accuracy and comprehensiveness. The resulting word list, manually validated from the outputs of both systems, is then used to create new features for a subjectivity classifier. It is shown that these simple features, using the word lists as input, improve performance over the previous state-of-the-art feature space for this task.

== Extraction Patterns ==

=== MetaBootstrapping ===

Both algorithms presented in this paper begin with a set of seed words representing a semantic category and then look for other words that occur in the same extraction patterns as those seeds. MetaBoot first uses syntactic templates to generate thousands of candidate patterns, which are ranked by the number of seed words they extract. At each iteration, the best pattern is selected and ''all'' words that it extracts are added as candidate seed words. Iterating this process generates a very large pool of candidates.
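
The selection step in this first phase can be sketched in a few lines. The simple count-based score and all names and toy data below are illustrative assumptions standing in for the paper's actual pattern-scoring heuristic:

<pre>
# Sketch of MetaBoot's pattern-selection step: rank candidate extraction
# patterns by how many current seed words they extract, take the best one,
# and collect everything it extracts as candidate seed words.
# The count-based score and the toy data are assumptions for illustration,
# not the paper's exact heuristic.

def best_pattern(pattern_extractions, seeds):
    """pattern_extractions: dict mapping a pattern to the set of words it extracts."""
    return max(pattern_extractions, key=lambda p: len(pattern_extractions[p] & seeds))

seeds = {"hope", "anger", "delight"}                      # hypothetical seed nouns
pattern_extractions = {
    "expressed <x>": {"hope", "anger", "concern", "sympathy"},
    "sense of <x>":  {"hope", "urgency", "humor"},
    "bought <x>":    {"shares", "stock"},
}

best = best_pattern(pattern_extractions, seeds)
candidates = pattern_extractions[best] - seeds            # new candidate words
print(best, candidates)                                   # "expressed <x>" extracts the most seeds
</pre>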

[[File:Metaboot.png]]

The second phase, which gives the algorithm its name, is essentially a reranking step. Given a list of words iteratively extracted from patterns, each word is scored based on how many patterns extracted that word. Based on this second heuristic, the top N (in this case, 5) words are added to the overall seed word list, expanding that list. This whole process, iterating on both levels, then continues until the subjectivity word lists have grown substantially.
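
The meta-level reranking and seed expansion can be sketched as follows; scoring candidates by the number of patterns that extracted them and promoting the top N per round comes from the description above, while the loop structure, names, and toy data are assumptions:

<pre>
# Sketch of the second, "meta" phase: candidate words gathered from the
# selected patterns are scored by how many patterns extracted them, and the
# top N are promoted into the seed list each round (the paper uses N = 5).
# Names and toy data are hypothetical.

from collections import Counter

def expand_seeds(candidate_lists, seeds, n=5):
    """candidate_lists: one set of extracted words per selected pattern."""
    counts = Counter()
    for extracted in candidate_lists:
        for word in extracted - seeds:
            counts[word] += 1                 # number of patterns extracting the word
    top = [word for word, _ in counts.most_common(n)]
    return seeds | set(top)

seeds = {"hope", "anger", "delight"}
candidate_lists = [
    {"hope", "concern", "sympathy"},          # extractions of the 1st selected pattern
    {"concern", "urgency", "humor"},          # extractions of the 2nd selected pattern
]
print(expand_seeds(candidate_lists, seeds, n=2))
# "concern", extracted by both patterns, ranks highest and is promoted first
</pre>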

=== Basilisk ===

Basilisk modifies the MetaBootstrapping algorithm by not assuming that a single semantic class is being extracted for a single pattern. Instead, in the second phase of candidate evaluation, words are ranked based on both the number of patterns which extracted those words, as well as the association of those words with the existing seed word lists. This set of collective information about a candidate is more robust than counts of patterns alone.
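
One common way to realize this combined score is an average of log seed-hit counts over the patterns that extract a word; the sketch below uses that formulation as an assumption about Basilisk-style scoring rather than a restatement of this paper's exact equations:

<pre>
# Sketch of a Basilisk-style candidate score: each word is credited not just
# for how many patterns extracted it, but for how strongly those patterns are
# associated with the current seed list. The average-log form below is an
# assumption for illustration, not necessarily this paper's exact formula.

import math

def avg_log_score(word, pattern_extractions, seeds):
    """Average, over the patterns extracting `word`, of log2(seed hits + 1)."""
    logs = [math.log2(len(extracted & seeds) + 1)
            for extracted in pattern_extractions.values() if word in extracted]
    return sum(logs) / len(logs) if logs else 0.0

seeds = {"hope", "anger", "delight"}
pattern_extractions = {
    "expressed <x>": {"hope", "anger", "concern"},
    "sense of <x>":  {"hope", "urgency", "concern"},
    "bought <x>":    {"shares", "stock"},
}

for word in ("concern", "urgency", "shares"):
    print(word, round(avg_log_score(word, pattern_extractions, seeds), 2))
# "concern" scores highest: the patterns that extract it also extract many seed words
</pre>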

An example of the output of these systems is shown below for a number of patterns:

[[File:Patterns.png]]

== Evaluation ==

First, the two systems were evaluated on the number of words they extracted and the accuracy of those words (essentially a precision/recall tradeoff, where recall is approximated by running more bootstrapping iterations rather than measured against a well-defined set of correct words).
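
One way to read this setup concretely: after each bootstrapping iteration, cumulative precision is the number of correct words generated so far divided by all words generated so far. The per-iteration counts in the sketch below are invented purely to illustrate the calculation:

<pre>
# Illustration of the precision curve used in this evaluation: cumulative
# precision after each iteration = correct words accepted so far / all words
# accepted so far. The per-iteration counts are made up for illustration only.

def precision_curve(per_iteration):
    """per_iteration: list of (newly correct words, newly accepted words)."""
    correct = total = 0
    curve = []
    for c, t in per_iteration:
        correct += c
        total += t
        curve.append(correct / total)
    return curve

per_iteration = [(5, 5), (4, 5), (3, 5), (2, 5)]          # hypothetical counts
print([round(p, 2) for p in precision_curve(per_iteration)])
# [1.0, 0.9, 0.8, 0.7] -- precision decays as the word list grows
</pre>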

[[File:Basilisk_metaboot.png]]

This comparison shows that precision stays high for many more iterations with Basilisk than with MetaBoot.

A simple subjectivity classifier was used to evaluate the usefulness of these word lists. After a word list was compiled and manually validated, a new set of features based on these words was created. Two classes of subjective words, weak and strong, were defined, and four features were created: one for each class of subjective words from each bootstrapping algorithm. For each sentence to be classified as subjective or not, a feature's value records whether 0, 1, or 2+ words from that feature's corresponding list occur in the sentence. These four features (marked below as SubjNoun) were added to the previous state-of-the-art feature set (marked in the table below as WBO). The additional features in the top row come from non-structured sources inspired by related work.
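
The four SubjNoun features follow directly from this description: for each (algorithm, strength) word list, the feature value records whether 0, 1, or 2-or-more of its words occur in the sentence. The word lists and names below are hypothetical placeholders:

<pre>
# Sketch of the four SubjNoun features: one per (bootstrapping algorithm,
# strength class) word list, valued 0, 1, or 2 (meaning "2 or more") according
# to how many of that list's words appear in the sentence.
# The word lists here are hypothetical placeholders.

word_lists = {
    ("basilisk", "strong"): {"outrage", "condemnation"},
    ("basilisk", "weak"):   {"concern", "views"},
    ("metaboot", "strong"): {"hatred", "sympathy"},
    ("metaboot", "weak"):   {"desire", "hope"},
}

def subjnoun_features(tokens, word_lists):
    features = {}
    for key, words in word_lists.items():
        hits = sum(1 for tok in tokens if tok.lower() in words)
        features[key] = min(hits, 2)          # cap at 2 to encode "2 or more"
    return features

tokens = "They expressed concern and hope but also hatred".split()
print(subjnoun_features(tokens, word_lists))
</pre>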

[[File:SubjNoun.png]]

This shows that adding features based on the bootstrapped word lists improves performance primarily by increasing precision rather than recall. It suggests that the benefit of bootstrapping in this instance was not to introduce new, more varied forms of subjectivity, but to confirm that examples found by the prior feature set were indeed subjective.
