Philgoo Han writeup of Brin
From Cohen Courses
This is a review of Brin_1999_extracting_patterns_and_relations_from_the_world_wide_web by user:Ironfoot.
- Extracting information from the WWW
- Low recall and high precision: reminiscent of the idea of optimizing for the 80% case and keeping the remaining 20% tolerable
- Generating patterns from a small seed relation: exponential growth
- Duality between tuples and patterns
- Patterns
- 7-feature occurrences (author, title, order, URL, prefix, middle, suffix), from which patterns are generated
- Heuristic for minimizing false positives
- Various segmentation and entity recognition methods, all covered in recent classes, may help.
- Experiment
- Lower expansion than expected
- Is there a comprehensive catalog of registered books that could serve as a quality measure? The result analysis seems limited to an overly intuitive level.
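
The tuple/pattern duality noted above can be illustrated with a minimal sketch of a bootstrapping loop: seed pairs yield occurrences, occurrences yield patterns, and patterns yield new pairs. This is a simplified illustration, not the paper's implementation. The function names (`find_occurrences`, `generate_patterns`, `apply_patterns`, `dipre`), the `AUTHOR` and `TITLE` regexes, the `min_support` rule, and the toy corpus are all assumptions made for this example. The URL prefix and the author/title order feature are left out, and a crude "at least two supporting occurrences with non-empty context" rule stands in for the false-positive heuristic.

```python
import re
from os.path import commonprefix  # character-wise longest common prefix

# Hypothetical, simplified shapes for author and title strings.
AUTHOR = r"([A-Z][A-Za-z .]{3,30})"
TITLE = r"([A-Z][A-Za-z ']{3,45})"


def find_occurrences(pairs, docs, ctx=10):
    """Collect (prefix, middle, suffix) contexts where author precedes title."""
    occs = []
    for author, title in pairs:
        rx = re.compile(re.escape(author) + r"(.{1,20}?)" + re.escape(title))
        for doc in docs:
            for m in rx.finditer(doc):
                prefix = doc[max(0, m.start() - ctx):m.start()]
                suffix = doc[m.end():m.end() + ctx]
                occs.append((prefix, m.group(1), suffix))
    return occs


def generate_patterns(occs, min_support=2):
    """Group occurrences by middle string and keep the shared context."""
    by_middle = {}
    for prefix, middle, suffix in occs:
        by_middle.setdefault(middle, []).append((prefix, suffix))
    patterns = set()
    for middle, ctxs in by_middle.items():
        if len(ctxs) < min_support:
            continue  # too little evidence: skip to limit false positives
        # Longest common suffix of the prefixes (reverse, compare, reverse).
        pre = commonprefix([p[::-1] for p, _ in ctxs])[::-1]
        # Longest common prefix of the suffixes.
        suf = commonprefix([s for _, s in ctxs])
        if pre and suf:  # require context on both sides
            patterns.add((pre, middle, suf))
    return patterns


def apply_patterns(patterns, docs):
    """Extract new (author, title) pairs matched by any pattern."""
    found = set()
    for pre, mid, suf in patterns:
        rx = re.compile(
            re.escape(pre) + AUTHOR + re.escape(mid) + TITLE + re.escape(suf)
        )
        for doc in docs:
            found.update(rx.findall(doc))
    return found


def dipre(seed, docs, rounds=2):
    """Alternate between pattern generation and tuple extraction."""
    pairs = set(seed)
    for _ in range(rounds):
        patterns = generate_patterns(find_occurrences(pairs, docs))
        pairs |= apply_patterns(patterns, docs)
    return pairs


# Toy corpus (made up for this sketch).
docs = [
    "<li><b>Isaac Asimov</b>, <i>The Robots of Dawn</i></li>",
    "<li><b>David Brin</b>, <i>Startide Rising</i></li>",
    "<li><b>Charles Dickens</b>, <i>Great Expectations</i></li>",
    "<p>Chaos by James Gleick</p>",
]
seed = {
    ("Isaac Asimov", "The Robots of Dawn"),
    ("David Brin", "Startide Rising"),
}
result = dipre(seed, docs)
```

On this toy corpus the Dickens pair is picked up because it shares the list-item markup of the two seeds, while the Gleick line, written in a different layout, is missed. That is the high-precision, low-recall trade-off in miniature.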