Liuy writeup of Bunescu 2005
This is a review of Bunescu_2005_a_shortest_path_dependency_kernel_for_relation_extraction by user:Liuy.
The paper aims to improve the extraction of top-level relations by using a new kernel computed over the dependency graph of a sentence. The proposed kernel is based on the shortest path between two entities in the graph. One difficulty is that methods that depend on paths in the graph are constrained by the sparsity of the data. The paper addresses this by assigning words to classes at different levels of generality (for example, noun, active verb or passive verb). In this way, the method can use both the path and the class information. Consequently, subsequences of nodes along the path can still be used as features even when they are sparse. For the remaining nodes on the path, the words are replaced by their generalized classes.
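To make the idea concrete, the following is a minimal sketch of such a kernel, not the authors' implementation. It assumes that each position on the shortest path (a word or a dependency-edge direction) is represented as a set of features at different levels of generality, that paths of different lengths have similarity zero, and that the similarity of two equal-length paths is the product of the number of features they share at each position. The function name `shortest_path_kernel`, the variables `path_x` and `path_y`, and the feature sets themselves are illustrative assumptions.

```python
from typing import List, Set

# A path is a list of positions; each position is a set of features
# (the word itself, its POS tag, a generalized class, an entity type, ...).
Path = List[Set[str]]


def shortest_path_kernel(x: Path, y: Path) -> int:
    """Count the features shared by two shortest dependency paths.

    Returns 0 when the paths differ in length; otherwise returns the
    product, over positions, of the number of features in common.
    """
    if len(x) != len(y):
        return 0
    score = 1
    for features_x, features_y in zip(x, y):
        score *= len(features_x & features_y)
        if score == 0:
            # One position with nothing in common zeroes the whole product.
            return 0
    return score


# Illustrative (hypothetical) feature sets for two paths of the same shape:
# "his actions in Brcko" and "his arrival in Beijing".
path_x: Path = [
    {"his", "PRP", "PERSON"}, {"->"},
    {"actions", "NNS", "Noun"}, {"<-"},
    {"in", "IN"}, {"<-"},
    {"Brcko", "NNP", "Noun", "LOCATION"},
]
path_y: Path = [
    {"his", "PRP", "PERSON"}, {"->"},
    {"arrival", "NN", "Noun"}, {"<-"},
    {"in", "IN"}, {"<-"},
    {"Beijing", "NNP", "Noun", "LOCATION"},
]

print(shortest_path_kernel(path_x, path_y))  # 3 * 1 * 1 * 1 * 2 * 1 * 3 = 18
```

The two paths share no middle word ("actions" vs. "arrival"), yet the kernel is non-zero because both words fall in the same generalized class. This is how the class information offsets the sparsity of exact word paths.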
I have the following concerns about the work. First, the assumption that the shortest path between the two entities can represent their relationship is not well justified. Given a sentence dependency graph, it is hard to prove that the information relevant to the relationship between two entities is necessarily concentrated on the shortest path between them. The argument given in the paper to explain this remains at a heuristic level and is not convincing.
Second, although the experiments on the ACE corpus show an advantage over the dependency tree kernel, the paper lacks an insightful analysis of why this happens. I am not convinced by positive results on a single dataset.