Relation Extraction
This is a technical problem related to one of the term projects in Information Extraction 10-707 in Fall 2010.
Relation extraction, broadly speaking, refers to the task of identifying relations between entities mentioned in a document. This can take many specific forms, such as labeling the relation between two given entities, finding all entity pairs that satisfy a relation, or even multiway relation extraction (also called Record Extraction).
History
Relation Extraction has been studied in depth in the MUC (MUC-6, MUC-7) and ACE (ACE2, ACE 2004, ACE 2005, ACE 2007, ACE 2008) series of conferences and evaluations. For biomedical literature, the BioCreAtIvE II tasks have also been useful.
Details
The most common type of Relation Extraction is binary (i.e. for two entities), and can take on one of the following specific forms:
- Finding the type of relationship between given entities in text
In one sense this problem is easier than entity extraction, because we only need to make one prediction instead of a vector of predictions. It is still considered harder, because it requires a variety of syntactic and semantic features, both local and nonlocal. This problem is solved using one of the following types of methods:
1) Feature-based methods
2) Kernel-based methods
3) Rule-based methods
- For a given entity and relationship, find all entities that satisfy the relationship with the given entity.
- For a given relationship, find all entity pairs that satisfy it.
This last problem is tackled by using a seed database of entity pairs to learn extraction patterns, which are then used to create candidate triplets (Entity1, Entity2, Relationship). Finally, these candidates are pruned.
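The seed-based procedure can be sketched roughly as follows. This is a minimal toy illustration, not a method taken from any particular system: the function names (learn_patterns, generate_candidates, prune), the single-capitalized-word entity matcher, the use of the text between two entities as the "pattern", and the pruning rule (drop triplets whose total pattern support falls below a threshold) are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: an entity is a single capitalized word (toy matcher only).
ENTITY = r"([A-Z][a-z]+)"


def learn_patterns(sentences, seed_pairs):
    """Count the text spans that connect known seed pairs in the corpus."""
    patterns = Counter()
    for sentence in sentences:
        for e1, e2 in seed_pairs:
            match = re.search(re.escape(e1) + r"(.+?)" + re.escape(e2), sentence)
            if match:
                patterns[match.group(1)] += 1
    return patterns


def generate_candidates(sentences, patterns, relation):
    """Apply each learned pattern; score a triplet by the summed support
    of the patterns that produced it."""
    scores = Counter()
    for middle, support in patterns.items():
        regex = re.compile(ENTITY + re.escape(middle) + ENTITY)
        for sentence in sentences:
            for e1, e2 in regex.findall(sentence):
                scores[(e1, e2, relation)] += support
    return scores


def prune(scores, min_score=2):
    """Hypothetical pruning rule: keep triplets with enough pattern support."""
    return {triplet for triplet, score in scores.items() if score >= min_score}


# Toy corpus and seed database (made up for illustration).
sentences = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Rome is the capital of Italy.",
    "Madrid is the capital of Spain.",
    "Paris hosted visitors from France.",
    "Oslo hosted visitors from Spain.",
]
seed_pairs = [("Paris", "France"), ("Berlin", "Germany")]

patterns = learn_patterns(sentences, seed_pairs)
scores = generate_candidates(sentences, patterns, "capital_of")
triplets = prune(scores)
print(sorted(triplets))
```

In this toy corpus the span " is the capital of " is supported by both seed pairs, while " hosted visitors from " is supported by only one, so the triplet produced solely by the weaker pattern is pruned. Real systems typically use richer pattern representations and confidence estimates than this simple count.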
Related Paper
The Information Extraction Survey by Sunita Sarawagi covers this problem, and prior work on solving it, in more detail.