Entropy Minimization for Semi-supervised Learning
This is a method introduced by [http://www.eprints.pascal-network.org/archive/00001978/01/grandvalet05.pdf Y. Grandvalet]. Minimum entropy regularization can be applied to any model of the posterior distribution. For unlabeled examples to be informative under this technique, the classes are assumed to be well apart, separated by a low-density region.
The learning set is denoted <math> \mathcal{L}_{n} = \{X^{(i)}, Z^{(i)}\}^{n}_{i=1} </math>, where the label indicator <math> Z^{(i)} \in \{0,1\}^{K} </math> encodes what is known about the class of <math> X^{(i)} </math>: if <math> X^{(i)} </math> is labeled as <math> \omega_{k} </math>, then <math> Z^{(i)}_{k} = 1 </math> and <math> Z^{(i)}_{l} = 0 </math> for <math> l \neq k </math>; if <math> X^{(i)} </math> is unlabeled, then <math> Z^{(i)}_{l} = 1 </math> for <math> l = 1, \ldots, K </math>.
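For illustration, a minimal NumPy sketch of this indicator encoding is given below; the function name <code>encode_labels</code> and the convention of marking unlabeled examples with <code>None</code> are assumptions made here, not part of the original formulation.
<pre>
import numpy as np

def encode_labels(labels, n_classes):
    """Build the indicator matrix Z described above.

    labels: sequence where entry i is the class index of X^(i),
            or None if X^(i) is unlabeled (illustrative convention).
    """
    Z = np.zeros((len(labels), n_classes))
    for i, y in enumerate(labels):
        if y is None:
            Z[i, :] = 1.0   # unlabeled: all entries set to 1
        else:
            Z[i, y] = 1.0   # labeled as class y: one-hot indicator
    return Z

# Example: four examples, three classes, the last two unlabeled
Z = encode_labels([0, 2, None, None], n_classes=3)
</pre>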
The regularizer is the empirical conditional entropy of the class labels conditioned on the observed variables:
:<math> H_{emp}(Y \mid X, Z; \mathcal{L}_{n}) = -\frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} P(\omega_{k} \mid X^{(i)}, Z^{(i)}) \log P(\omega_{k} \mid X^{(i)}, Z^{(i)}) </math>
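As a rough sketch, assuming the model's posteriors are available as an <code>(n, K)</code> array <code>P</code> with <math> P_{ik} = P(\omega_{k} \mid X^{(i)}, Z^{(i)}) </math> (the array name and the <code>eps</code> guard are assumptions for illustration), this quantity could be computed as:
<pre>
import numpy as np

def empirical_conditional_entropy(P, eps=1e-12):
    """H_emp(Y | X, Z): average entropy of the per-example posteriors.

    P: array of shape (n, K); row i holds P(w_k | X^(i), Z^(i)).
    eps guards against log(0) for fully confident predictions.
    """
    return -np.mean(np.sum(P * np.log(P + eps), axis=1))
</pre>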
Assuming that labels are missing at random, the posterior given the observed variables factors through the label indicator:
:<math> P(\omega_{k} \mid X^{(i)}, Z^{(i)}) = \frac{Z^{(i)}_{k} \, P(\omega_{k} \mid X^{(i)})}{\sum_{l=1}^{K} Z^{(i)}_{l} \, P(\omega_{l} \mid X^{(i)})} </math>
so the entropy term vanishes for labeled examples and reduces to the entropy of the model posterior <math> P(\cdot \mid X^{(i)}) </math> for unlabeled ones.
The estimate is obtained by maximizing a criterion that combines the conditional log-likelihood with the entropy regularization term, weighted by <math> \lambda </math>:
:<math> C(\theta, \lambda; \mathcal{L}_{n}) = L(\theta; \mathcal{L}_{n}) - \lambda \, H_{emp}(Y \mid X, Z; \mathcal{L}_{n}) </math>
where <math> L(\theta; \mathcal{L}_{n}) </math> is the conditional log-likelihood of the observed labels; the entropy term favors confident, low-entropy predictions on the unlabeled examples.
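A minimal sketch of this criterion, assuming the model's class posteriors <math> P(\omega_{k} \mid X^{(i)}) </math> are supplied as an <code>(n, K)</code> NumPy array; the function and variable names (<code>entropy_regularized_criterion</code>, <code>P_model</code>, <code>lam</code>) are illustrative, not from the original paper.
<pre>
import numpy as np

def entropy_regularized_criterion(P_model, Z, lam, eps=1e-12):
    """C(theta, lambda) = L(theta) - lambda * H_emp, to be maximized.

    P_model: (n, K) array of model posteriors P(w_k | X^(i)).
    Z:       (n, K) indicator matrix built as above (one-hot for
             labeled examples, all ones for unlabeled ones).
    lam:     entropy regularization weight lambda.
    """
    # Posterior given the observed variables under the
    # missing-at-random assumption
    P_cond = Z * P_model
    P_cond = P_cond / (P_cond.sum(axis=1, keepdims=True) + eps)

    # Conditional log-likelihood of the observed labels
    # (unlabeled rows contribute log(1) = 0)
    log_lik = np.sum(np.log(np.sum(Z * P_model, axis=1) + eps))

    # Empirical conditional entropy of the class labels
    H_emp = -np.mean(np.sum(P_cond * np.log(P_cond + eps), axis=1))

    return log_lik - lam * H_emp
</pre>
In this sketch the labeled examples drive the log-likelihood term, while only the unlabeled examples contribute to the entropy penalty, which pushes their predicted posteriors away from the decision boundary.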