Philgoo Han's writeup of Banko, Cafarella, Soderland, Broadhead and Etzioni

TextRunner: bootstrapping, domain-independent, scalable
Self-supervised learner
- Parse seed data to find tuple features for a Naive Bayes classifier
  - Does it suffer from dependencies between features?
  - Can enough features be found?
  - Small data -> high bias?
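
A minimal sketch of the classifier step, not the paper's implementation: a Bernoulli Naive Bayes with add-one smoothing over binary tuple features. The feature names (`rel_has_verb`, `long_rel`, ...) and the seed examples are hypothetical. The model treats features as conditionally independent given the label, which is the assumption the dependency question above is about.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (feature_set, label) pairs, label True = trustworthy."""
    label_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, label in examples:
        label_counts[label] += 1
        for f in feats:
            feat_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feat_counts, vocab

def predict_nb(model, feats):
    """Return the label with the highest log posterior for a feature set."""
    label_counts, feat_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        for f in vocab:
            # Smoothed P(feature present | label); never exactly 0 or 1.
            p = (feat_counts[label][f] + 1) / (n + 2)
            score += math.log(p if f in feats else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical seed tuples, labeled by heuristics over the parse.
seed = [
    ({"rel_has_verb", "short_rel"}, True),
    ({"rel_has_verb", "short_rel", "e1_proper"}, True),
    ({"long_rel", "crosses_clause"}, False),
    ({"long_rel"}, False),
]
model = train_nb(seed)
```

Calling `predict_nb(model, {"rel_has_verb", "short_rel"})` classifies a new candidate tuple from its features alone, so no parser is needed at extraction time.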
Single-pass extractor
- Most probable POS tag for each word -> noun phrase chunker (entities found here) -> non-essential phrase elimination (relations found here)
- Classify each candidate with the classifier above to keep only trustworthy entity-relation tuples
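
The pipeline above can be sketched roughly as follows. This is an illustration under stated assumptions, not TextRunner's code: the input is assumed to be already POS-tagged with Penn Treebank tags, the chunker is a simple tag-set rule, and "non-essential" is approximated by dropping adverbs. `NP_TAGS`, `DROP_TAGS`, `chunk` and `extract` are hypothetical names.

```python
NP_TAGS = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS"}  # assumption: tags that form noun phrases
DROP_TAGS = {"RB", "RBR", "RBS"}                    # assumption: adverbs are non-essential

def chunk(tagged):
    """Group consecutive tokens into ('NP', tokens) or ('O', tokens) spans."""
    spans = []
    for word, tag in tagged:
        kind = "NP" if tag in NP_TAGS else "O"
        if spans and spans[-1][0] == kind:
            spans[-1][1].append((word, tag))
        else:
            spans.append((kind, [(word, tag)]))
    return spans

def extract(tagged):
    """Return candidate (entity1, relation, entity2) tuples from one sentence."""
    spans = chunk(tagged)
    tuples = []
    for i in range(len(spans) - 2):
        left, mid, right = spans[i:i + 3]
        if left[0] == "NP" and mid[0] == "O" and right[0] == "NP":
            # Relation = text between the two noun phrases, minus dropped tokens.
            rel = [w for w, t in mid[1] if t not in DROP_TAGS]
            if rel:
                e1 = " ".join(w for w, _ in left[1])
                e2 = " ".join(w for w, _ in right[1])
                tuples.append((e1, " ".join(rel), e2))
    return tuples

sentence = [("Edison", "NNP"), ("reportedly", "RB"), ("invented", "VBD"),
            ("the", "DT"), ("phonograph", "NN")]
candidates = extract(sentence)
```

Each candidate would then be passed to the classifier, and only tuples labeled trustworthy are kept.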
Redundancy-based assessor
- Simple redundancy count assessor
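
A rough sketch of a redundancy count, assuming each extraction arrives with the id of the sentence it came from: tuples are normalized (lowercased, whitespace collapsed), merged, and scored by the number of distinct supporting sentences. The normalization rule and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def normalize(t):
    """Lowercase each field and collapse runs of whitespace."""
    return tuple(" ".join(part.lower().split()) for part in t)

def assess(extractions):
    """extractions: iterable of (tuple, sentence_id).
    Returns {normalized_tuple: number of distinct supporting sentences}."""
    support = defaultdict(set)
    for t, sentence_id in extractions:
        support[normalize(t)].add(sentence_id)
    return {t: len(ids) for t, ids in support.items()}

# Hypothetical extractions; sentence 1 yields the same tuple twice.
extractions = [
    (("Edison", "invented", "the phonograph"), 1),
    (("edison", "invented", "the  phonograph"), 2),
    (("Edison", "invented", "the phonograph"), 1),
    (("Tesla", "was born in", "Smiljan"), 3),
]
counts = assess(extractions)
```

Counting distinct sentences rather than raw occurrences keeps a single repeated sentence from inflating a tuple's support.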
Query Processing
- Distributed inverted indexing
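
As an illustration of the indexing idea, here is a single-node inverted index over extracted tuples, where a query returns the tuples that contain every query term. It is a sketch only: the tokenization is a plain lowercase split, and the distributed part (for example, partitioning tuples across machines and merging the results) is left out.

```python
from collections import defaultdict

def build_index(tuples):
    """Map each lowercase term to the set of ids of tuples containing it."""
    index = defaultdict(set)
    for tid, t in enumerate(tuples):
        for part in t:
            for term in part.lower().split():
                index[term].add(tid)
    return index

def query(index, tuples, text):
    """Return the tuples that contain every term in the query text."""
    terms = text.lower().split()
    if not terms:
        return []
    ids = set.intersection(*(index.get(term, set()) for term in terms))
    return [tuples[i] for i in sorted(ids)]

# Hypothetical extracted tuples.
facts = [
    ("Edison", "invented", "the phonograph"),
    ("Edison", "founded", "General Electric"),
    ("Bell", "invented", "the telephone"),
]
index = build_index(facts)
```

For example, `query(index, facts, "edison invented")` intersects the posting sets for both terms, so only the first tuple is returned.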
Low model complexity
Compared results with KnowItAll
