Bbd writeup for C. Sutton et al.


This is a review of Sutton_2004_collective_segmentation_and_labeling_of_distant_entities_in_information_extraction by user:bbd.

This paper proposes a skip-chain CRF, which is basically an extension of the linear-chain CRF that adds edges, called skip edges, between occurrences of similar entities. They address the label consistency problem within a single document and report better accuracy than other existing techniques such as relational Markov networks. By using a probabilistic sequence model like a CRF, they do segmentation and labelling simultaneously, leveraging the dependency between the two tasks. Also, using a conditional model rather than a generative model like an HMM gives the flexibility to add features that depend on the input x.
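To make the graph structure concrete, here is a minimal sketch of how skip edges could be added on top of a linear chain. It assumes that "similar entities" means identical capitalized tokens; the function names `linear_edges` and `skip_edges` and the example sentence are hypothetical and only for illustration, not the authors' code.

```python
from collections import defaultdict


def linear_edges(tokens):
    """Edges of the ordinary linear chain: each token to its right neighbour."""
    return [(i, i + 1) for i in range(len(tokens) - 1)]


def skip_edges(tokens):
    """Pairs (i, j), i < j, linking repeated tokens.

    Assumption: two tokens count as "similar" when they are identical
    and start with an uppercase letter.
    """
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            positions[tok].append(i)

    edges = []
    for idxs in positions.values():
        # Connect every pair of occurrences of the same token.
        for a in range(len(idxs)):
            for b in range(a + 1, len(idxs)):
                edges.append((idxs[a], idxs[b]))
    return sorted(edges)


tokens = "Speaker : John Green . Professor Green will talk about CRFs".split()
chain = linear_edges(tokens)
skips = skip_edges(tokens)
print(skips)  # the two mentions of "Green" are linked
```

The skip edges are what make the graph loopy, so exact chain inference (forward-backward) no longer applies once they are added.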

I liked their claim that skip-edge features improve model confidence on some labels. Since a skip edge carries information about the context at both ends, a confident prediction at one end can help label the other end when the model is not confident there.
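A toy calculation may help show this effect. The sketch below takes two mentions joined by a single skip edge and computes exact marginals by enumerating all label pairs. The unary scores, the label set, and the single `same_label_bonus` factor are made-up assumptions; in the actual model the skip-edge potential is a learned function of features, not one constant.

```python
from itertools import product

LABELS = ("SPEAKER", "OTHER")


def marginals(unary_a, unary_b, same_label_bonus):
    """Exact marginals for two nodes joined by one skip edge (brute force).

    same_label_bonus is a hypothetical pairwise factor: it multiplies the
    score of any assignment where both ends take the same label.
    """
    scores = {}
    for ya, yb in product(LABELS, repeat=2):
        pair = same_label_bonus if ya == yb else 1.0
        scores[(ya, yb)] = unary_a[ya] * unary_b[yb] * pair
    z = sum(scores.values())
    p_a = {y: sum(s for (ya, _), s in scores.items() if ya == y) / z for y in LABELS}
    p_b = {y: sum(s for (_, yb), s in scores.items() if yb == y) / z for y in LABELS}
    return p_a, p_b


confident = {"SPEAKER": 0.9, "OTHER": 0.1}  # clear local context
unsure = {"SPEAKER": 0.5, "OTHER": 0.5}     # ambiguous local context

_, without_edge = marginals(confident, unsure, same_label_bonus=1.0)
_, with_edge = marginals(confident, unsure, same_label_bonus=4.0)
print(without_edge["SPEAKER"], with_edge["SPEAKER"])
```

With a bonus of 1.0 the edge has no effect and the unsure end stays at 0.5; with a bonus above 1.0 its SPEAKER marginal moves toward the confident end.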

They use loopy belief propagation to learn parameters; since the model is complex, that may result in a large training time. I liked the two-stage model by Vijay Krishnan more than this, since it is simple and its training time is small.