KeisukeKamataki writeup of Sutton 2004
This is a review of Sutton_2004_collective_segmentation_and_labeling_of_distant_entities_in_information_extraction by user:KeisukeKamataki.
- Summary:
They proposed a model called the skip-chain CRF, which extends the linear-chain CRF so that it can handle dependencies between hidden states that are far apart in the sequence. As a graphical model it is pretty simple: it adds skip edges between distant positions on top of the original linear-chain CRF. Which skip edges to include depends on the similarity metric we use (such as whether the two words are identical, string edit distance, and so on). The mathematical model basically consists of a simple combination of a linear-chain part and a skip-chain part.
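As a rough illustration of the edge-selection step described above, here is a minimal sketch that connects every pair of positions holding the same capitalized token. The function name `skip_edges` and the capitalization filter are assumptions made for this example; the review only says that the choice depends on a similarity metric such as word identity or edit distance.

```python
from collections import defaultdict
from itertools import combinations


def skip_edges(tokens):
    """Return sorted index pairs (u, v), u < v, to connect with skip edges.

    Assumption for this sketch: two positions are "similar" when they
    hold the identical token and that token starts with a capital letter.
    """
    positions_by_token = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            positions_by_token[tok].append(i)

    edges = []
    for positions in positions_by_token.values():
        # Every pair of repeated occurrences gets one skip edge.
        edges.extend(combinations(positions, 2))
    return sorted(edges)


tokens = ["Speaker", ":", "John", "Smith", ".",
          "Professor", "Smith", "will", "speak", "."]
print(skip_edges(tokens))  # [(3, 6)]
```

A different similarity metric (for example an edit-distance threshold) would only change the grouping step; the rest of the model would then place one skip-chain factor on each returned pair.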
For parameter estimation, they use MAP estimation to estimate the parameters lambda from the training set. To keep likelihood-based training from overfitting, they use a spherical Gaussian prior on the parameters. As for inference, the skip edges introduce loops into the graph, which makes exact inference expensive, so they use an approximate inference technique.
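The MAP objective with a spherical Gaussian prior can be sketched as a penalized log-likelihood. The notation here is an assumption for illustration, not taken from the review: training pairs $(x^{(i)}, y^{(i)})$, parameters $\lambda_k$, and a zero-mean prior with shared variance $\sigma^2$.

```latex
\hat{\lambda}
  = \arg\max_{\lambda}
    \sum_{i} \log p_{\lambda}\bigl(y^{(i)} \mid x^{(i)}\bigr)
    \;-\; \sum_{k} \frac{\lambda_k^{2}}{2\sigma^{2}}
```

The second term is the log of the Gaussian prior up to a constant; a smaller $\sigma^2$ pulls the weights more strongly toward zero.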
Although there is no big difference in F-measure between the linear-chain CRF and the skip-chain CRF, they achieved good prediction for the speaker field, which tends to appear multiple times in a document. This is especially clear in the analysis of the number of tokens that are inconsistently mislabeled (a reduction from 30.2 tokens to 4.8 tokens, roughly 84%).
- I like:
This paper is clear about how to extend a general probabilistic framework to a goal-specific problem. Their evaluation is also good because it makes clear when this algorithm could be useful.