Difference between revisions of "Attribute Extraction"

Revision as of 19:20, 30 November 2010

Summary

Attribute Extraction is a problem in the field of information extraction that focuses on identifying the properties or features that describe a named entity. Attribute extraction is often used to disambiguate person names, extract encyclopedic knowledge, and improve question answering.

Common Approaches

Some approaches to Attribute Extraction include:

Template/Pattern-Learning
- Learn contextual template patterns using seed-based bootstrapping, and assign each candidate attribute a probability based on its surrounding context.
Extract/Verify
- Two-step procedure: first, the system uses rules, NER, and manually or automatically created patterns to extract all attribute candidates.
- Then verify the candidates using a classifier (with features based on the context, pattern values, and dependency path) trained to determine whether an attribute value is correct for the given individual or should be discarded.
Position Based
- Base predictions on the absolute and relative positions at which attribute values typically appear in documents.
Transitivity-Based
- Use the transitivity of attributes across co-occurring entities. Co-occurring entities, such as people mentioned in a given person's biography page, tend to have similar attributes.
Latent-Based
- Use a topic model to detect attributes that may not be mentioned directly in an article.
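
As an illustration of the Template/Pattern-Learning approach above, here is a minimal sketch in Python. It is not taken from any particular system: the function names (`learn_patterns`, `template_to_regex`, `extract_candidates`), the `<ENTITY>`/`<VALUE>` placeholders, the capitalized-word heuristic for matching names, and the frequency-based score are all assumptions made for this example. The sketch performs a single bootstrapping round, and the score is a crude confidence rather than a calibrated probability.

```python
import re
from collections import Counter

# Assumption: entities and values are runs of capitalized words.
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"


def learn_patterns(sentences, seeds):
    """Turn each sentence containing a seed (entity, value) pair into a template."""
    patterns = Counter()
    for sentence in sentences:
        for entity, value in seeds:
            if entity in sentence and value in sentence:
                template = sentence.replace(entity, "<ENTITY>").replace(value, "<VALUE>")
                # Keep only templates with exactly one slot of each kind.
                if template.count("<ENTITY>") == 1 and template.count("<VALUE>") == 1:
                    patterns[template] += 1
    return patterns


def template_to_regex(template):
    """Compile a template into a regex with named groups for the two slots."""
    escaped = re.escape(template)
    escaped = escaped.replace(re.escape("<ENTITY>"), "(?P<entity>" + NAME + ")")
    escaped = escaped.replace(re.escape("<VALUE>"), "(?P<value>" + NAME + ")")
    return re.compile("^" + escaped + "$")


def extract_candidates(sentences, patterns):
    """Apply learned templates; score each match by the template's share of seed hits."""
    total = sum(patterns.values())
    found = {}
    for template, count in patterns.items():
        regex = template_to_regex(template)
        for sentence in sentences:
            match = regex.match(sentence)
            if match:
                key = (match.group("entity"), match.group("value"))
                found[key] = found.get(key, 0.0) + count / total
    return found


if __name__ == "__main__":
    corpus = [
        "Ada Lovelace was born in London.",
        "Alan Turing was born in London.",
        "Grace Hopper was born in New York.",
    ]
    learned = learn_patterns(corpus, [("Ada Lovelace", "London")])
    for pair, score in extract_candidates(corpus, learned).items():
        print(pair, score)
```

In a full bootstrapping loop, the highest-scoring new pairs would be fed back in as seeds for the next round. An Extract/Verify system would instead pass every candidate to a trained classifier before accepting it.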

Evaluation

One evaluation venue for the attribute extraction task has been the Web People Search workshop (WePS: Searching information about entities in the web), which has included an attribute extraction challenge in its past two workshops: WePS-2 Attribute Extraction Subtask Guidelines and WePS-3 Attribute Extraction Subtask Guidelines.

@@ Line 7: / Line 7: @@
 Some approaches to Attribute Extraction include:
 * '''Template/Pattern-Learning'''
-** Learn template contextual patterns using seed-based bootstrapping.
+** Learn template contextual patterns using seed-based bootstrapping, and assign probability of attribute based on surrounding context.
 * '''Extract/Verify'''
-** Two step procedure: First system uses rules and patterns to extract all attribute candidates
+** Two step procedure: First system uses rules, NER and manually or automatically created patterns to extract all attribute candidates
-** Then verify candidates using a classifier (with features based on the context, pattern values, and dependency path) to determine if attribute value is correct for the given individual or should be discarded
+** Then verify candidates using a classifier (with features based on the context, pattern values, and dependency path) to trained determine if attribute value is correct for the given individual or should be discarded
 * '''Position Based'''
 ** Basing predictions on absolute and relative ordering of where the attribute values typically appear in documents.
