Wka writeup of Bellare and McCallum 2009

From Cohen Courses

This is a review of bellare_2009_generalized_expectation_criteria_for_bootstrapping_extractors_using_record_text_alignment by user:wka.

The paper uses data in an existing database (DB) to annotate text automatically. A CRF aligns the tokens of a DB record with their occurrences in the text. The resulting annotation is then used to train an extractor.
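As a rough illustration of projecting DB labels onto text, the sketch below replaces the learned CRF alignment with naive exact string matching. The toy record, the `project_labels` helper, and the `O` label for unaligned tokens are assumptions made for this example only; this is not the paper's method.

```python
def project_labels(record, text_tokens, default="O"):
    """Label each text token with the DB field whose value contains it.

    Exact, case-insensitive matching stands in for the learned alignment.
    """
    # Map each lowercased DB token to the field (label) it came from.
    token_to_field = {}
    for field, value in record.items():
        for tok in value.lower().split():
            token_to_field.setdefault(tok, field)
    # Strip simple punctuation before looking a text token up.
    return [token_to_field.get(tok.lower().strip(".,"), default)
            for tok in text_tokens]


# Toy DB record and a matching citation string (both made up for illustration).
record = {
    "author": "Kedar Bellare",
    "title": "Generalized Expectation Criteria",
    "year": "2009",
}
text = "K. Bellare . Generalized expectation criteria , 2009".split()
labels = project_labels(record, text)
```

Here the abbreviated first name `K.` is left unlabeled because it never matches `Kedar` exactly. Gaps of this kind are what a learned alignment model is meant to close.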

CRF

Feature vector:
- alignment features: on source-target tokens
- extraction features: on source labels and target text
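A minimal sketch of how the two feature groups might be combined, assuming "source" means the DB record and "target" means the text. The function names and the specific features (exact match, initial match, capitalization, and so on) are hypothetical choices for illustration, not the feature set used in the paper.

```python
def alignment_features(src_token, tgt_token):
    """Features on a (source token, target token) pair."""
    s = src_token.lower()
    t = tgt_token.lower().rstrip(".")
    return {
        "exact_match": float(s == t),
        # A single letter that starts the source token, e.g. "K." vs "Kedar".
        "initial_match": float(len(t) == 1 and s.startswith(t)),
        "same_length": float(len(s) == len(t)),
    }


def extraction_features(src_label, tgt_tokens, i):
    """Features tying a source label to the target token at position i."""
    tok = tgt_tokens[i]
    return {
        f"label={src_label}&word={tok.lower()}": 1.0,
        f"label={src_label}&is_digit={tok.isdigit()}": 1.0,
        f"label={src_label}&is_capitalized={tok[:1].isupper()}": 1.0,
    }


def feature_vector(src_token, src_label, tgt_tokens, i):
    """Sparse feature vector: alignment features plus extraction features."""
    feats = alignment_features(src_token, tgt_tokens[i])
    feats.update(extraction_features(src_label, tgt_tokens, i))
    return feats


fv = feature_vector("Kedar", "author", ["K.", "Bellare"], 0)
```

In this sketch the extraction features depend only on the label and the text, so they remain usable when no DB record is available.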

Parameters are optimized with L-BFGS. The objective is non-convex, but local optima are good enough in practice; for ExtrCRF the objective is convex.

ExtrCRF (a first-order model) is as accurate as AlignCRF (a zero-order model), even though ExtrCRF has no access to DB data.
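For reference, the zero-order versus first-order distinction can be written in generic CRF notation. The symbols below (output variables $y_t$, input $x$, parameters $\theta$, feature function $f$) are standard textbook notation assumed for illustration and are not copied from the paper.

```latex
% Zero-order: each output variable is predicted independently given the input.
p_\theta(y \mid x) = \prod_{t=1}^{T}
  \frac{\exp\big(\theta^\top f(y_t, x, t)\big)}
       {\sum_{y'} \exp\big(\theta^\top f(y', x, t)\big)}

% First-order (linear chain): features may also depend on the previous output.
p_\theta(y \mid x) = \frac{1}{Z_\theta(x)}
  \exp\Big(\sum_{t=1}^{T} \theta^\top f(y_{t-1}, y_t, x, t)\Big),
\qquad
Z_\theta(x) = \sum_{y'} \exp\Big(\sum_{t=1}^{T} \theta^\top f(y'_{t-1}, y'_t, x, t)\Big)
```

The first-order model can capture dependencies between neighbouring labels, which the zero-order model cannot.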

Error is reduced by 31% relative to the previous state of the art.
