Nlao writeup of Brin 1999

From Cohen Courses

This is a review of Brin_1999_extracting_patterns_and_relations_from_the_world_wide_web by user:Nlao.

DIPRE is an early (probably seminal) work on extraction by bootstrapping. The "instance-pattern duality" principle it relies on can be traced back a long way:

- Harris’ Distributional Hypothesis (DH) (Harris, 1964)  “Words that tend to occur in the same contexts tend to have similar meanings”. 
- Robison’s Point-wise Assertion Patterns (PAP) (Robison, 1970) “w1 is in a relation r with w2 if context pattern r(w1, w2) is observed”
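
To make the duality concrete, here is a minimal Python sketch of a bootstrapping loop in this spirit: known (author, title) instances induce patterns, and the patterns in turn yield new instances. It is an illustration under strong assumptions, not the DIPRE implementation. Each "sentence" is assumed to consist of exactly one entity, a middle context, and another entity; a pattern is just that middle context plus the argument order; and all names (`CORPUS`, `SEEDS`, `induce_patterns`, `apply_patterns`, `bootstrap`) are hypothetical.

```python
# Toy corpus and seed set; both are made up for illustration.
CORPUS = [
    "Foundation, written by Isaac Asimov",
    "Dune, written by Frank Herbert",
    "Frank Herbert is the author of Dune",
    "Arthur C. Clarke is the author of Rendezvous with Rama",
]
SEEDS = {("Isaac Asimov", "Foundation")}


def induce_patterns(corpus, pairs):
    """Instances -> patterns: record the text between a known pair."""
    patterns = set()
    for sentence in corpus:
        for author, title in pairs:
            # Try both argument orders: author first, then title first.
            for first, second, author_first in (
                (author, title, True),
                (title, author, False),
            ):
                if (
                    sentence.startswith(first)
                    and sentence.endswith(second)
                    and len(sentence) > len(first) + len(second)
                ):
                    middle = sentence[len(first):len(sentence) - len(second)]
                    patterns.add((middle, author_first))
    return patterns


def apply_patterns(corpus, patterns):
    """Patterns -> instances: split each sentence on a known middle context."""
    pairs = set()
    for sentence in corpus:
        for middle, author_first in patterns:
            left, sep, right = sentence.partition(middle)
            if sep and left and right:
                pairs.add((left, right) if author_first else (right, left))
    return pairs


def bootstrap(corpus, seeds, rounds=3):
    """Alternate the two steps until no new pair appears or rounds run out."""
    pairs = set(seeds)
    for _ in range(rounds):
        new_pairs = apply_patterns(corpus, induce_patterns(corpus, pairs))
        if new_pairs <= pairs:
            break
        pairs |= new_pairs
    return pairs
```

Starting from the single seed, the first round learns the ", written by " context and picks up the Herbert pair; that pair then exposes the " is the author of " context, which reaches the Clarke pair. The sketch has no filter against overly general patterns, so on real text it would drift quickly.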

Here are some later works that use this principle:

- UNICON (Lin and Pantel 2001a)
- DIRT (Lin and Pantel 2001b)
- VerbOCEAN (Chklovski & Pantel, 2004)
- TEASE (Szpektor et al., 2004) (Zanzotto et al 2006)
- Espresso (Pantel & Pennacchiotti 2006)
- TextRunner (Banko et al. 2007)

[minor points]

- As an early work, its implementation is not efficient: no index is used to speed up pattern matching over a large corpus.
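
As a rough illustration of what an index would buy, the Python sketch below builds a token-level inverted index and uses it to narrow down which documents need to be scanned for a given pattern. This is an assumption about one possible remedy, not something taken from the paper; the function names (`tokenize`, `build_index`, `candidate_docs`, `match`) are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def candidate_docs(index, pattern_text):
    """Ids of documents that contain every token of the pattern."""
    tokens = tokenize(pattern_text)
    if not tokens:
        return set()
    postings = [index.get(token, set()) for token in tokens]
    return set.intersection(*postings)


def match(docs, index, pattern_text):
    """Run the exact substring check only on the candidate documents."""
    candidates = sorted(candidate_docs(index, pattern_text))
    return [doc_id for doc_id in candidates if pattern_text in docs[doc_id]]
```

The index lookup is only a filter: documents that contain all the pattern's tokens in the wrong order still pass it, so the exact check in `match` remains necessary, but it runs on far fewer documents than a full scan.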