Difference between revisions of "Attribute Extraction"

Latest revision as of 19:29, 30 November 2010

Summary

Attribute Extraction is a problem in the field of information extraction that focuses on identifying the properties or features that describe a named entity. It is often used to disambiguate person names, extract encyclopedic knowledge, and improve question answering systems.

Common Approaches

Some approaches to Attribute Extraction include:

Template/Pattern-Learning
- Learn contextual template patterns using seed-based bootstrapping, and assign a probability to each candidate attribute based on its surrounding context. Variations of this method seem to be the predominant approach in the literature.
Position Based
- Base predictions on the absolute and relative positions at which attribute values typically appear in documents.
Transitivity-Based
- Use the transitivity of attributes across co-occurring entities. Co-occurring entities, such as the people mentioned on a given person's biography page, tend to have similar attributes.
Latent-Based
- Detect attributes that may not be directly mentioned in an article, based on a topic model.
Two-step: Extract then Verify
- First, the system uses rules, NER, gazetteer-based matching, and patterns (manually created or learned) to extract all attribute candidates.
- Then, it verifies the candidates using a classifier (with features typically based on the context, pattern values, and dependency path) trained to determine whether an attribute value is correct for the given individual or should be discarded.
- Sometimes (depending on the application), researchers have opted to filter attribute candidates based on lexical patterns instead of performing classification.
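
As an illustration of the Template/Pattern-Learning approach listed above, the following is a minimal sketch of seed-based bootstrapping, not an implementation from any particular paper. It makes several simplifying assumptions: a template is a whole sentence with the entity and the attribute value replaced by placeholders, matching is exact, and no probability is assigned to patterns or candidates (real systems score both). The function names, the placeholder strings, and the toy sentences are all hypothetical.

```python
import re

ENTITY, VALUE = "<ENTITY>", "<VALUE>"  # hypothetical placeholder tokens

# Toy corpus, invented for illustration only.
SENTENCES = [
    "Marie Curie was born in Warsaw.",
    "Alan Turing was born in London.",
    "The birthplace of Alan Turing is London.",
    "The birthplace of Ada Lovelace is London.",
]


def learn_patterns(sentences, pairs):
    """Turn each sentence containing a known (entity, value) pair into a template."""
    patterns = set()
    for entity, value in pairs:
        for sentence in sentences:
            if entity in sentence and value in sentence:
                template = sentence.replace(entity, ENTITY).replace(value, VALUE)
                # Keep only templates with exactly one slot of each kind.
                if template.count(ENTITY) == 1 and template.count(VALUE) == 1:
                    patterns.add(template)
    return patterns


def pattern_to_regex(pattern):
    """Compile a template into a regex with named groups for the two slots."""
    escaped = re.escape(pattern)
    escaped = escaped.replace(re.escape(ENTITY), r"(?P<entity>.+?)")
    escaped = escaped.replace(re.escape(VALUE), r"(?P<value>.+?)")
    return re.compile(escaped)


def apply_patterns(sentences, patterns):
    """Extract new (entity, value) pairs from sentences that match a template."""
    found = set()
    for pattern in patterns:
        regex = pattern_to_regex(pattern)
        for sentence in sentences:
            match = regex.fullmatch(sentence)
            if match:
                found.add((match.group("entity"), match.group("value")))
    return found


def bootstrap(sentences, seeds, rounds=2):
    """Alternate between learning templates and extracting new pairs."""
    pairs = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, pairs)
        pairs |= apply_patterns(sentences, patterns)
    return pairs
```

Calling `bootstrap(SENTENCES, {("Marie Curie", "Warsaw")})` starts from a single seed. The first round learns the "was born in" template and extracts the Alan Turing pair; the second round uses that new pair to learn the "birthplace of" template, which in turn covers Ada Lovelace. Without a scoring step, a noisy template would propagate errors in the same way, which is why the approaches described above attach probabilities based on context.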

Evaluation

One evaluation venue for the attribute extraction task has been the Web People Search workshop (WePS: Searching information about entities in the web), which has included an attribute extraction challenge in its past two workshops: WePS-2 Attribute Extraction Subtask Guidelines and WePS-3 Attribute Extraction Subtask Guidelines.

@@ Line 1: / Line 1: @@
 == Summary ==
-Attribute Extraction is a [[category::problem]] in the field of information extraction that focuses on identifying properties/features that describe a named entity. Performing attribute extract is often used in disambiguating person names, extracting encylopedic knowledge, and in improving question answering.
+Attribute Extraction is a [[category::problem]] in the field of information extraction that focuses on identifying properties/features that describe a named entity. Performing attribute extract is often used in disambiguating person names, extracting encylopedic knowledge, and in improving question answering systems.
 == Common Approaches ==
@@ Line 7: / Line 7: @@
 Some approaches to Attribute Extraction include:
 * '''Template/Pattern-Learning'''
-** Learn template contextual patterns using seed-based bootstrapping, and assign probability of attribute based on surrounding context. Variations of this method seems to be the predominately used approach.
+** Learn template contextual patterns using seed-based bootstrapping, and assign probability of attribute based on surrounding context. Variations of this method seems to be the predominately used approach in the literature.
 * '''Position Based'''
 ** Basing predictions on absolute and relative ordering of where the attribute values typically appear in documents.
@@ Line 14: / Line 14: @@
 * '''Latent-Based'''
 ** Detect attributes that may not directly be mentioned in an article based on a topic-model.
-* '''Extract then Verify'''
+* '''Two-step: Extract then Verify'''
-** Two step procedure: First system uses rules, NER, gazetteer based matching, and patterns (manually created or learned) to extract all attribute candidates
+** First system uses rules, NER, gazetteer based matching, and patterns (manually created or learned) to extract all attribute candidates
-** Then verify candidates using a classifier (with features based on the context, pattern values, and dependency path) to trained determine if attribute value is correct for the given individual or should be discarded
+** Then verify candidates using a classifier (with features typically based on the context, pattern values, dependency path) to trained determine if attribute value is correct for the given individual or should be discarded
 ** Sometimes (depending on the application), researchers have opted to filter attribute candidates based on lexical patterns instead of performing classification.
