Entropy Minimization for Semi-supervised Learning

Minimum entropy regularization can be applied to any model of the posterior distribution.

The learning set is denoted <math>L_{n} = \{X^{(i)}, Z^{(i)}\}^{n}_{i=1}</math>, where <math>Z^{(i)} \in \{0,1\}^{K}</math>: if <math>X^{(i)}</math> is labeled as <math>w_{k}</math>, then <math>Z^{(i)}_{k} = 1</math> and <math>Z^{(i)}_{l} = 0</math> for <math>l \neq k</math>; if <math>X^{(i)}</math> is unlabeled, then <math>Z^{(i)}_{l} = 1</math> for <math>l = 1, \ldots, K</math>.
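
As an illustration of this encoding (a minimal sketch in NumPy, not part of the original article; the function and variable names are hypothetical), the indicator matrix <math>Z</math> for a partially labeled data set with <math>K</math> classes can be built as follows.

<pre>
import numpy as np

def build_label_indicators(labels, num_classes):
    """Build the (n, K) indicator matrix Z for a partially labeled set.

    labels: list of class indices in {0, ..., K-1}, or None when unlabeled.
    A labeled point gets a one-hot row; an unlabeled point gets an
    all-ones row (no label information rules out any class).
    """
    n = len(labels)
    Z = np.zeros((n, num_classes), dtype=int)
    for i, y in enumerate(labels):
        if y is None:        # unlabeled: Z^{(i)}_l = 1 for all l
            Z[i, :] = 1
        else:                # labeled as class y: one-hot indicator
            Z[i, y] = 1
    return Z

# Example: three points, K = 3 classes; the second point is unlabeled.
Z = build_label_indicators([0, None, 2], num_classes=3)
# Z == [[1, 0, 0],
#       [1, 1, 1],
#       [0, 0, 1]]
</pre>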

The conditional entropy of the class labels given the observed variables is

<math>
H(Y|X,Z; L_{n}) = -\frac{1}{n} \sum^{n}_{i=1} \sum^{K}_{k=1} P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)}) \log P(Y^{(i)}=w_{k}|X^{(i)},Z^{(i)})
</math>
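
As a concrete illustration (again a hypothetical NumPy sketch, assuming the per-example posteriors are available as an <math>(n, K)</math> array), the empirical conditional entropy above can be computed directly from those posteriors.

<pre>
import numpy as np

def conditional_entropy(posteriors, eps=1e-12):
    """Empirical conditional entropy H(Y|X,Z; L_n).

    posteriors: (n, K) array whose i-th row holds
                P(Y^{(i)} = w_k | X^{(i)}, Z^{(i)}) for k = 1, ..., K.
    Returns the average entropy over the n examples.
    """
    p = np.clip(posteriors, eps, 1.0)   # avoid log(0); zero entries still contribute 0
    return -np.mean(np.sum(posteriors * np.log(p), axis=1))
</pre>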

Assuming that labels are missing at random, we have that

<math>
P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)}) = \frac{Z^{(i)}_{k} P(Y^{(i)}=w_{k}|X^{(i)})}{\sum^{K}_{l=1} Z^{(i)}_{l} P(Y^{(i)}=w_{l}|X^{(i)})}
</math>
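
Continuing the sketch above (hypothetical code; the model posteriors <math>P(Y^{(i)}=w_{k}|X^{(i)})</math> are assumed to be given as an <math>(n, K)</math> array), this posterior can be obtained by masking the model posteriors with <math>Z</math> and renormalizing each row.

<pre>
import numpy as np

def posterior_given_indicators(model_posteriors, Z, eps=1e-12):
    """Compute P(Y^{(i)} = w_k | X^{(i)}, Z^{(i)}).

    model_posteriors: (n, K) array of P(Y^{(i)} = w_k | X^{(i)}).
    Z:                (n, K) 0/1 label-indicator matrix.
    Zeros out classes ruled out by Z, then renormalizes each row.
    """
    masked = Z * model_posteriors
    return masked / (masked.sum(axis=1, keepdims=True) + eps)

# Usage with the earlier sketches:
#   reg = conditional_entropy(posterior_given_indicators(model_posteriors, Z))
</pre>

Note that for a labeled example the masked row has a single non-zero entry, so the resulting posterior is the one-hot label and contributes zero entropy; only the unlabeled examples, whose <math>Z^{(i)}</math> is all ones, are penalized by the entropy term.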

The posterior distribution is defined as