Bbd writeup of learning to extract relations from web

This is a review of Bunescu_2007_learning_to_extract_relations_from_the_web_using_minimal_supervision by user:Bbd.


This paper presents a technique for extracting relations from only a small number of training examples. The authors formulate the problem as Multiple Instance Learning (MIL) and extend existing techniques by using SVMs with string kernels. SVMs help avoid overfitting by separating positive and negative examples with a maximum-margin decision hyperplane. For this task they experiment with subsequence kernels.
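To make the MIL setup concrete, here is a minimal sketch of how sentences might be grouped into bags, one bag per entity pair. The function names and the toy sentences are hypothetical, and the step where every sentence simply inherits its bag's label is an assumed simplification for illustration, not something stated in this review.

```python
from collections import defaultdict

def build_bags(sentences, labeled_pairs):
    """Group sentences into bags keyed by the entity pair they mention."""
    bags = defaultdict(list)
    for text, pair in sentences:
        if pair in labeled_pairs:
            bags[pair].append(text)
    return dict(bags)

def instance_labels(bags, labeled_pairs):
    """Assumed simplification: each sentence inherits the label of its bag."""
    return [(text, labeled_pairs[pair])
            for pair, texts in bags.items()
            for text in texts]

# Toy data, invented for illustration: +1 marks a pair known to be in the relation.
labeled_pairs = {("Google", "YouTube"): +1, ("Yahoo", "Microsoft"): -1}
sentences = [
    ("Google bought YouTube in 2006", ("Google", "YouTube")),
    ("Google and YouTube announced a partnership", ("Google", "YouTube")),
    ("Yahoo rejected the offer from Microsoft", ("Yahoo", "Microsoft")),
]

bags = build_bags(sentences, labeled_pairs)
examples = instance_labels(bags, labeled_pairs)
```

The second sentence in the positive bag does not actually express the relation, which is the kind of label noise the MIL formulation has to tolerate.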

I liked their solution to Type I bias: they factor the weight of a sequence into a product of per-word weights. The initial weights are set so that words correlated with either of the two arguments receive lower weights.
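The factorization can be sketched as follows. The down-weighting formula, the `base` parameter, and the correlation dictionaries are all assumptions made for illustration; the paper's exact weighting scheme may differ.

```python
from math import prod

def word_weight(word, corr_arg1, corr_arg2, base=1.0):
    """Hypothetical rule: the more a word correlates with either argument,
    the smaller its weight. Correlations are assumed to lie in [0, 1]."""
    c = max(corr_arg1.get(word, 0.0), corr_arg2.get(word, 0.0))
    return base * (1.0 - c)

def sequence_weight(words, corr_arg1, corr_arg2, base=1.0):
    """Weight of a word sequence as the product of its per-word weights."""
    return prod(word_weight(w, corr_arg1, corr_arg2, base) for w in words)

# Toy correlations, invented for illustration.
corr_arg1 = {"search": 0.5}   # "search" co-occurs often with the first argument
corr_arg2 = {"video": 0.75}   # "video" co-occurs often with the second argument

neutral = sequence_weight(["bought"], corr_arg1, corr_arg2)
biased = sequence_weight(["search", "video"], corr_arg1, corr_arg2)
```

Under these assumptions a sequence made of argument-correlated words gets a much smaller weight than a neutral one, so it contributes less to the kernel.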

I did not fully understand Type II bias as illustrated by their example.