Attribute Extraction

From Cohen Courses

Revision as of 19:14, 30 November 2010

Summary

Attribute Extraction is a problem in the field of information extraction that focuses on identifying the properties or features that describe a named entity. Attribute extraction is often used in disambiguating person names, extracting encyclopedic knowledge, and improving question answering.

Common Approaches

Some approaches to Attribute Extraction include:

  • Template/Pattern-Learning
    • Learn template contextual patterns using seed-based bootstrapping.
  • Extract/Verify
    • Two-step procedure: first, the system uses rules and patterns to extract all attribute candidates.
    • Then, a classifier (with features based on the context, pattern values, and dependency path) verifies each candidate, determining whether the attribute value is correct for the given individual or should be discarded.
  • Position Based
    • Base predictions on the absolute and relative positions at which attribute values typically appear in documents.
  • Transitivity-Based
    • Use the transitivity of attributes across co-occurring entities. Co-occurring entities, such as people mentioned in a given person's biography page, tend to have similar attributes.
  • Latent-Based
    • Detect attributes that may not be directly mentioned in an article, using a topic model.
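
To make the Template/Pattern-Learning approach more concrete, here is a minimal sketch of seed-based bootstrapping in Python. It is an illustration under simplifying assumptions, not a method taken from any particular paper: a "pattern" is just the literal text found between an entity and its attribute value, attribute values are single words, and the function names (learn_patterns, apply_patterns, bootstrap) and the min_count threshold are hypothetical.

```python
import re
from collections import Counter


def learn_patterns(sentences, pairs, min_count=2):
    """Collect the text between a known entity and its known value.

    Assumption: the entity always precedes the value in the sentence.
    A context is kept as a pattern only if it is seen min_count times.
    """
    counts = Counter()
    for entity, value in pairs:
        for sentence in sentences:
            i = sentence.find(entity)
            j = sentence.find(value)
            if i != -1 and j != -1 and i < j:
                counts[sentence[i + len(entity):j]] += 1
    return {ctx for ctx, n in counts.items() if n >= min_count}


def apply_patterns(sentences, patterns, entities):
    """Use learned patterns to propose new (entity, value) pairs."""
    found = set()
    for sentence in sentences:
        for entity in entities:
            for pattern in patterns:
                # Match "<entity><pattern><one word>" literally.
                m = re.search(re.escape(entity + pattern) + r"(\w+)", sentence)
                if m:
                    found.add((entity, m.group(1)))
    return found


def bootstrap(sentences, seeds, entities, rounds=2, min_count=2):
    """Alternate between learning patterns and extracting new pairs."""
    known = set(seeds)
    patterns = set()
    for _ in range(rounds):
        patterns |= learn_patterns(sentences, known, min_count)
        known |= apply_patterns(sentences, patterns, entities)
    return patterns, known
```

Given a few seed pairs such as ("Alice", "Paris") for a birthplace attribute, bootstrap learns the contexts that connect them and reuses those contexts to extract values for other entities. A real system would generalize patterns (for example with wildcards or part-of-speech tags) and score them to limit semantic drift across rounds.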

Evaluation

One venue for evaluating the attribute extraction task has been the Web People Search workshop (WePS: Searching information about entities in the web), which has included an attribute extraction challenge in its past two workshops: WePS-2 Attribute Extraction Subtask Guidelines, WePS-3 Attribute Extraction Subtask Guidelines.

Relevant Papers