Jiao et al COLING 2006

Revision as of 01:44, 30 September 2010

Citation

Jiao, F., Wang, S., Lee, C.H., Greiner, R., and Schuurmans, D. Semi-supervised conditional random fields for improved sequence segmentation and labeling. Proceedings of the 21st International Conference on Computational Linguistics. (2006) 209-216.

Online Version

http://acl.ldc.upenn.edu/P/P06/P06-1027.pdf

Summary

This paper presents a novel approach to training CRFs in a semi-supervised setting. HMMs and other generative models incorporate unlabeled data easily through EM but have difficulty with non-independent features, while semi-supervised discriminative approaches had been less well explored. By incorporating unlabeled data, the new technique improves accuracy over a baseline CRF trained only on labeled data. The authors also developed an efficient dynamic programming algorithm for computing the covariance matrix of the features, which is needed to compute the gradient of the objective and run gradient-based optimization.

The key idea is to minimize the conditional entropy of the model's label distribution on the unlabeled data. This makes the model more certain about its labelings of that data and favors labelings that reinforce the supervised labels. The authors also motivate this as maximizing a KL divergence, which pushes two distributions "farther" apart, i.e. decreases their overlap.
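The entropy idea, and the role of the feature covariance matrix mentioned in the summary, can be illustrated on a toy example. The following is only a sketch under stated assumptions: the binary tag set, the three-symbol observation vocabulary, the feature layout, and the weight values are all hypothetical, and the posterior is computed by brute-force enumeration rather than by the paper's dynamic programming algorithm. For a log-linear model, the gradient of the negative entropy sum_y p(y|x) log p(y|x) with respect to the weights works out to Cov[f(x, y)] times the weight vector, which is why a covariance matrix is needed; the code checks this identity against finite differences.

```python
import itertools
import math

LABELS = (0, 1)   # hypothetical binary tag set, e.g. O / GENE
NUM_OBS = 3       # hypothetical observation vocabulary size

def features(x, y):
    """Feature counts f(x, y): emission (label, obs) slots, then transition (prev, cur) slots."""
    n_emit = len(LABELS) * NUM_OBS
    f = [0.0] * (n_emit + len(LABELS) ** 2)
    for t, (obs, lab) in enumerate(zip(x, y)):
        f[lab * NUM_OBS + obs] += 1.0
        if t > 0:
            f[n_emit + y[t - 1] * len(LABELS) + lab] += 1.0
    return f

def posterior(theta, x):
    """p_theta(y | x) for every label sequence y, by brute-force enumeration."""
    seqs = list(itertools.product(LABELS, repeat=len(x)))
    feats = [features(x, y) for y in seqs]
    scores = [sum(w * v for w, v in zip(theta, f)) for f in feats]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps], feats

def neg_entropy(theta, x):
    """sum_y p(y|x) log p(y|x), the quantity the entropy regularizer tries to increase."""
    probs, _ = posterior(theta, x)
    return sum(p * math.log(p) for p in probs if p > 0.0)

def neg_entropy_grad(theta, x):
    """Gradient of neg_entropy, computed as Cov_p[f(x, y)] times theta."""
    probs, feats = posterior(theta, x)
    d = len(theta)
    mean = [sum(p * f[i] for p, f in zip(probs, feats)) for i in range(d)]
    grad = [0.0] * d
    for i in range(d):
        for j in range(d):
            second = sum(p * f[i] * f[j] for p, f in zip(probs, feats))
            grad[i] += (second - mean[i] * mean[j]) * theta[j]
    return grad

def finite_diff_grad(theta, x, eps=1e-6):
    """Central-difference approximation, used only to check the analytic gradient."""
    grad = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        grad.append((neg_entropy(up, x) - neg_entropy(down, x)) / (2 * eps))
    return grad

# Hypothetical weights: 6 emission weights followed by 4 transition weights.
theta = [0.3, -0.2, 0.5, -0.4, 0.1, 0.2, 0.6, -0.1, -0.3, 0.4]
x = (0, 2, 1, 2)  # one unlabeled observation sequence

analytic = neg_entropy_grad(theta, x)
numeric = finite_diff_grad(theta, x)
max_gap = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest gap between analytic and numeric gradient:", max_gap)
print("negative entropy at theta:", neg_entropy(theta, x))
print("negative entropy at 5 * theta:", neg_entropy([5 * w for w in theta], x))
```

Scaling the weights up makes the posterior more peaked, so the negative entropy rises toward zero; with all weights at zero the posterior is uniform over the 16 label sequences. Enumeration is exponential in sequence length, which is the cost the paper's dynamic programming algorithm is designed to avoid.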

The optimization criterion is to maximize the conditional log-likelihood of the labeled examples plus the negative conditional entropy of the unlabeled examples, together with a regularization term on the parameters. The extra entropy term makes the objective non-concave, so the optimizer may only reach a local maximum. However, one can still attempt to improve on a fully supervised CRF by using its learned parameter values as the starting point for L-BFGS.
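Written out, the criterion described above has roughly the following form. The notation here is an assumption made for illustration and may not match the paper's symbols exactly: examples 1..N are labeled, examples N+1..M are unlabeled, U(θ) is the regularizer, γ is a weight trading off the entropy term, and f(x, y) is the feature vector.

```latex
\begin{aligned}
RL(\theta) &= \sum_{i=1}^{N} \log p_\theta\bigl(y^{(i)} \mid x^{(i)}\bigr) \;-\; U(\theta)
  \;+\; \gamma \sum_{j=N+1}^{M} \sum_{y} p_\theta\bigl(y \mid x^{(j)}\bigr) \log p_\theta\bigl(y \mid x^{(j)}\bigr) \\[4pt]
\frac{\partial}{\partial \theta} \sum_{y} p_\theta(y \mid x) \log p_\theta(y \mid x)
  &= \operatorname{Cov}_{p_\theta(y \mid x)}\bigl[f(x, y)\bigr]\, \theta
\end{aligned}
```

The second line follows from writing log p_θ(y|x) = θ·f(x, y) − log Z_θ(x) and differentiating; it shows where the feature covariance matrix enters the gradient.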

In an experiment on named entity recognition of gene names, the semi-supervised CRF generally achieved much better recall and F-measure than the supervised baseline.

Related Papers

This form of minimum entropy regularization was first explored by Grandvalet and Bengio (2004) for a single, unstructured variable.