Difference between revisions of "Attribute Extraction"
From Cohen Courses
PastStudents (talk | contribs)
Revision as of 19:14, 30 November 2010
Summary
Attribute Extraction is a problem in the field of information extraction that focuses on identifying the properties or features that describe a named entity. Attribute extraction is often used to disambiguate person names, extract encyclopedic knowledge, and improve question answering.
Common Approaches
Some approaches to Attribute Extraction include:
- Template/Pattern-Learning
  - Learn contextual template patterns using seed-based bootstrapping.
- Extract/Verify
  - A two-step procedure: first, the system uses rules and patterns to extract all attribute candidates.
  - Then a classifier (with features based on the context, pattern values, and dependency path) verifies each candidate, determining whether the attribute value is correct for the given individual or should be discarded.
- Position-Based
  - Base predictions on the absolute and relative positions at which attribute values typically appear in documents.
- Transitivity-Based
  - Use the transitivity of attributes across co-occurring entities. Co-occurring entities, such as people mentioned in a given person's biography page, tend to have similar attributes.
- Latent-Based
  - Detect attributes that may not be directly mentioned in an article, based on a topic model.
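As a rough illustration of the Template/Pattern-Learning approach listed above, the following is a minimal sketch of seed-based bootstrapping. It is not taken from any particular system: the function names (`learn_patterns`, `apply_patterns`, `bootstrap`), the use of the literal text between an entity and its value as the "pattern", the 40-character context window, and the capitalized-words heuristic for entities are all assumptions made for this example.

```python
import re
from collections import Counter


def learn_patterns(sentences, pairs, min_count=2):
    """Collect the text found between a known entity and its known value.

    A context is kept as a pattern only if it is seen at least
    `min_count` times (a hypothetical reliability threshold).
    """
    counts = Counter()
    for sent in sentences:
        for entity, value in pairs:
            # Shortest stretch of up to 40 characters between entity and value.
            m = re.search(re.escape(entity) + r"(.{1,40}?)" + re.escape(value), sent)
            if m:
                counts[m.group(1)] += 1
    return {ctx for ctx, n in counts.items() if n >= min_count}


def apply_patterns(sentences, patterns):
    """Use learned contexts to extract new (entity, value) pairs."""
    found = set()
    for sent in sentences:
        for ctx in patterns:
            # Assumption: an entity is a run of capitalized words and
            # a value is a single word token.
            regex = r"((?:[A-Z][a-z]+ )*[A-Z][a-z]+)" + re.escape(ctx) + r"(\w+)"
            for m in re.finditer(regex, sent):
                found.add((m.group(1), m.group(2)))
    return found


def bootstrap(sentences, seeds, rounds=2, min_count=2):
    """Alternate between learning patterns and extracting new pairs."""
    known = set(seeds)
    for _ in range(rounds):
        patterns = learn_patterns(sentences, known, min_count)
        new_pairs = apply_patterns(sentences, patterns) - known
        if not new_pairs:
            break  # nothing new was extracted, so stop early
        known |= new_pairs
    return known


# Toy corpus and seed pairs, invented for this example.
sentences = [
    "Ada Lovelace was born in London.",
    "Alan Turing was born in London.",
    "Grace Hopper was born in Manhattan.",
]
seeds = {("Ada Lovelace", "London"), ("Alan Turing", "London")}
print(sorted(bootstrap(sentences, seeds)))
```

On this toy corpus the two seed pairs share the context " was born in ", which then picks up the unseeded pair ("Grace Hopper", "Manhattan"). A realistic system would score patterns rather than count them and would guard against semantic drift across rounds.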
Evaluation
One evaluation venue for the attribute extraction task has been the Web People Search workshop (WePS: Searching information about entities in the web), which has included an attribute extraction challenge in its past two workshops: WePS-2 Attribute Extraction Subtask Guidelines and WePS-3 Attribute Extraction Subtask Guidelines.