Difference between revisions of "Attribute Extraction"

From Cohen Courses

Revision as of 19:20, 30 November 2010

Summary

Attribute Extraction is a problem in the field of information extraction that focuses on identifying properties/features that describe a named entity. Attribute extraction is often used to disambiguate person names, extract encyclopedic knowledge, and improve question answering.

Common Approaches

Some approaches to Attribute Extraction include:

  • Template/Pattern-Learning
    • Learn contextual template patterns using seed-based bootstrapping, and assign each attribute a probability based on its surrounding context.
  • Extract then Verify
    • Two-step procedure: first, the system uses rules, NER, and manually or automatically created patterns to extract all attribute candidates.
    • Then a trained classifier (with features based on the context, pattern values, and dependency path) verifies each candidate, determining whether the attribute value is correct for the given individual or should be discarded.
  • Position Based
    • Base predictions on the absolute and relative positions at which attribute values typically appear in documents.
  • Transitivity-Based
    • Use the transitivity of attributes across co-occurring entities. Co-occurring entities, such as people mentioned in a given person's biography page, tend to have similar attributes.
  • Latent-Based
    • Detect attributes that may not be directly mentioned in an article, using a topic model.
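
As a rough illustration of the Extract then Verify approach listed above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the regular expressions in PATTERNS, the Candidate class, and the function names are hypothetical, and the verify step is a hand-written rule (the person's surname must appear in the same sentence) standing in for the trained classifier that a real system would use.

```python
import re
from dataclasses import dataclass

# Hypothetical hand-written patterns. A real system would also use NER
# and automatically learned patterns, as described in the list above.
PATTERNS = {
    "birth_year": re.compile(r"\bborn in (\d{4})\b"),
    "occupation": re.compile(r"\bis an? ([a-z]+)\b"),
}


@dataclass
class Candidate:
    attribute: str
    value: str
    sentence: str  # context kept so the verifier can inspect it


def extract_candidates(text):
    """Step 1: over-generate attribute candidates from every sentence."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for attribute, pattern in PATTERNS.items():
            for match in pattern.finditer(sentence):
                candidates.append(Candidate(attribute, match.group(1), sentence))
    return candidates


def verify(candidate, person):
    """Step 2: stand-in for a trained classifier (assumption).

    Accepts a candidate only if the person's surname occurs in the
    sentence the candidate was extracted from.
    """
    surname = person.split()[-1]
    return surname in candidate.sentence


def extract_attributes(text, person):
    """Run both steps and return the (attribute, value) pairs that survive."""
    return [
        (c.attribute, c.value)
        for c in extract_candidates(text)
        if verify(c, person)
    ]
```

For the text "Ada Lovelace was born in 1815. Lovelace is a mathematician. Charles Babbage was born in 1791.", calling extract_attributes(text, "Ada Lovelace") keeps the birth year 1815 and the occupation "mathematician" and discards 1791, because that candidate's sentence does not mention Lovelace. Replacing verify with a learned model over context, pattern, and dependency-path features would bring the sketch closer to the approach as described.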

Evaluation

One evaluation venue for the attribute extraction task has been the Web People Search workshop (WePS: Searching information about entities in the web), which has had an attribute extraction challenge in its past two workshops: WePS-2 Attribute Extraction Subtask Guidelines and WePS-3 Attribute Extraction Subtask Guidelines.

Relevant Papers