KeisukeKamataki writeup of Bunescu 2006
This is a review of Bunescu_2006_subsequence_kernels_for_relation_extraction by user:KeisukeKamataki.
Summary: The authors applied subsequence kernels to semantic relation extraction, covering both protein interactions and top-level relations in newspaper corpora. The key idea is to capture the frequent patterns by which a relation or interaction is expressed in three sub-kernels, named for where the relevant words sit relative to the two entities: "Fore-Between", "Between", and "Between-After". They alleviated the word-sparsity problem by generalizing words to classes such as POS tags. The kernel was used inside an SVM classifier. Their classification results consistently outperformed the previous best kernel, K4 (the sum of a bag-of-words kernel and the dependency kernel).
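To make the kernel idea concrete, here is a minimal Python sketch of a gap-weighted subsequence kernel over tokens represented as feature sets (word plus POS tag), in the spirit of the summary above. It is a brute-force illustration and not the authors' implementation: the names `common_features` and `subsequence_kernel`, the decay parameter `lam`, and the toy sentences are all assumptions made for this example, and a practical version would use dynamic programming instead of enumerating index tuples.

```python
from itertools import combinations


def common_features(x, y):
    """Number of features (e.g. word, POS tag) shared by two tokens."""
    return len(x & y)


def subsequence_kernel(s, t, n, lam=0.75):
    """Brute-force gap-weighted subsequence kernel (illustrative only).

    s, t : lists of frozensets, one feature set per token
    n    : length of the subsequences to compare
    lam  : decay factor in (0, 1]; longer spans are penalized
    """
    total = 0.0
    for i in combinations(range(len(s)), n):
        for j in combinations(range(len(t)), n):
            # Product of shared-feature counts at aligned positions.
            match = 1
            for a, b in zip(i, j):
                match *= common_features(s[a], t[b])
                if match == 0:
                    break
            if match:
                # Span = number of positions covered in each sequence.
                span = (i[-1] - i[0] + 1) + (j[-1] - j[0] + 1)
                total += match * lam ** span
    return total


# Toy example: different words, same POS tags (hypothetical data).
s = [frozenset({"interacts", "VB"}), frozenset({"with", "IN"})]
t = [frozenset({"binds", "VB"}), frozenset({"to", "IN"})]

k_cross = subsequence_kernel(s, t, n=2, lam=0.5)
k_self = subsequence_kernel(s, s, n=2, lam=0.5)
```

Because the two toy sentences share only POS tags, `k_cross` is nonzero even though no words match, which is how generalizing words to classes eases sparsity; `k_self` is larger because both the word and the tag match at each position.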
I like: This paper is clear about how to design and use kernels for relation extraction. It would be even better if the authors had analyzed how much each sub-kernel contributes to the performance gain.