Nschneid writeup of Poon 2007

From Cohen Courses

This is Nschneid's review of Poon_2007_joint_inference_in_information_extraction

This paper uses a Markov Logic Network (MLN) to jointly segment and cluster citations. In the data set, each citation is segmented into Author, Title, and Venue fields, and citations that refer to the same work are identified. Rather than performing segmentation followed by clustering in a pipeline, the authors show that knowledge about coreferent entities helps with segmentation. Singh, Schultz, & McCallum 2009 improve on these results by using a factor graph with MCMC instead of an MLN, which makes inference more efficient.
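To make the intuition concrete, here is a toy Python sketch of how a coreferent citation can help segmentation. It is not an MLN and not the authors' method: the coreference link is simply given rather than inferred, the two citations are made up, and the function names (`segment_by_periods`, `segment_with_coreferent`) are hypothetical.

```python
FIELDS = ("author", "title", "venue")


def segment_by_periods(citation):
    """Isolated segmenter: expects 'Author. Title. Venue.' and gives up otherwise."""
    parts = [p.strip() for p in citation.split(".") if p.strip()]
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def segment_with_coreferent(citation, coreferent_segmentation):
    """Reuse the title of a citation assumed to be coreferent to find the title span here."""
    title = coreferent_segmentation["title"]
    start = citation.lower().find(title.lower())
    if start < 0:
        return None
    end = start + len(title)
    return {
        "author": citation[:start].strip(" .,"),
        "title": citation[start:end],
        "venue": citation[end:].strip(" .,"),
    }


# Made-up citations: the same work, once with and once without punctuation.
clean = "Smith and Jones. Learning to segment citations. Toy Conference 2000."
noisy = "Smith and Jones Learning to segment citations Toy Conference 2000"

clean_seg = segment_by_periods(clean)
noisy_alone = segment_by_periods(noisy)                  # no boundaries to find
noisy_joint = segment_with_coreferent(noisy, clean_seg)  # boundaries borrowed

print(noisy_alone)
print(noisy_joint)
```

In a pipeline, the unpunctuated citation would be segmented (badly) before clustering ever sees it. Here the segmentation of the clean citation supplies the missing boundaries, which is the kind of information flow that joint inference allows.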

  • Could the MLN approach be expected to scale to larger problems? Segmentation of bibliographic entries (which follow a fairly regular form) is presumably much simpler than, say, word segmentation (or morpheme segmentation for a highly agglutinative language).