Apappu writeup on Bellare and McCullum

Latest revision as of 10:42, 3 September 2010

The paper presents a CRF-based approach to aligning database (DB) records with their corresponding realizations in running text.

The approach relies on a word alignment mechanism between the input text and the database record.

The word alignment model imitates IBM Model 1, in which each target token is aligned to a single source token (or to null), while one source token may be aligned to several target tokens.
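For reference, the generative IBM Model 1 that the alignment model is compared to can be sketched in a few lines. This is a minimal illustration, not the authors' CRF: the function names `train_ibm1` and `align`, the `<null>` token, and the treatment of the record as source and the text as target are assumptions made for this example.

```python
from collections import defaultdict

NULL = "<null>"  # hypothetical placeholder for "aligned to nothing"


def train_ibm1(pairs, iterations=10):
    """Estimate t(target | source) with EM.

    pairs: list of (source_tokens, target_tokens) tuples.
    Returns a dict-like table keyed by (target_word, source_word).
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Uniform initialisation over the target vocabulary.
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each target word over all source positions.
        for src, tgt in pairs:
            src = [NULL] + list(src)
            for w in tgt:
                z = sum(t[(w, s)] for s in src)
                for s in src:
                    p = t[(w, s)] / z
                    count[(w, s)] += p
                    total[s] += p
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(float, {(w, s): c / total[s] for (w, s), c in count.items()})
    return t


def align(src, tgt, t):
    """For each target token, return the index of its best source token.

    Index 0 is the null token; index i > 0 refers to src[i - 1].
    """
    src = [NULL] + list(src)
    return [max(range(len(src)), key=lambda i: t[(w, src[i])]) for w in tgt]
```

Each target token independently picks one source position, which is why several target tokens can end up on the same source token but never the other way around.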

The advantage of a word alignment model is that it handles discrepancies caused by spelling errors, word insertions or deletions, and extra fields.
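As an illustration of how an alignment model can tolerate spelling errors, the sketch below computes match features for a pair of tokens. The feature names and the use of `difflib.SequenceMatcher` are assumptions for this example; the writeup does not list the features the authors actually use.

```python
from difflib import SequenceMatcher


def match_features(record_token, text_token):
    """Hypothetical features for one (record token, text token) pair."""
    a, b = record_token.lower(), text_token.lower()
    return {
        # Fires only when the two tokens are identical (ignoring case).
        "exact": 1.0 if a == b else 0.0,
        # Soft similarity in [0, 1]; stays high under small spelling errors.
        "similarity": SequenceMatcher(None, a, b).ratio(),
        # Fires when both tokens start with the same character.
        "same_initial": 1.0 if a[:1] == b[:1] and a else 0.0,
    }
```

With features like these, a misspelled token still receives a high similarity score even though the exact-match feature is zero, so the alignment can survive the error.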

The authors propose two feature sets, each with a corresponding CRF: one addresses the alignment problem, and the other handles extraction, with features defined on labels and the input text. The two models differ in order: the alignment CRF (alignCRF) is a zero-order model, whereas the extraction CRF (extrCRF) is a first-order model.
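The practical difference between a zero-order and a first-order model shows up at decoding time. The sketch below is a generic illustration under assumed inputs, not the authors' implementation: `scores[t][y]` is a hypothetical local score for label `y` at position `t`, `trans[p][y]` is a hypothetical score for moving from label `p` to label `y`, and `scores` is assumed to be non-empty.

```python
def decode_zero_order(scores):
    """Zero-order: every position is decided independently of its neighbours."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def decode_first_order(scores, trans):
    """First-order: Viterbi search over local plus transition scores."""
    n_labels = len(scores[0])
    best = list(scores[0])  # best path score ending in each label
    back = []               # back-pointers, one list per position after the first
    for row in scores[1:]:
        prev = best
        # For each label y, find the best previous label p.
        ptr = [max(range(n_labels), key=lambda p: prev[p] + trans[p][y])
               for y in range(n_labels)]
        best = [prev[ptr[y]] + trans[ptr[y]][y] + row[y] for y in range(n_labels)]
        back.append(ptr)
    # Follow the back-pointers from the best final label.
    y = max(range(n_labels), key=best.__getitem__)
    path = [y]
    for ptr in reversed(back):
        y = ptr[y]
        path.append(y)
    return path[::-1]
```

In the zero-order case the output at one position cannot influence its neighbours, whereas the first-order decoder can override a weak local preference when the transition scores favour a consistent label sequence.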

The authors found that alignCRF outperforms the generative IBM Model 4 and HMM alignment models. Better performance is expected in particular because the CRF can take into account non-independent features of the running text.

The authors also showed that their extraction CRF model outperforms the other models it was compared with, as well as previous state-of-the-art systems.

Revision as of 13:11, 28 October 2009 by Apappu; latest revision as of 10:42, 3 September 2010 by WikiAdmin (1 revision).