Minimum entropy regularization can be applied to any model of the posterior distribution.
The learning set is denoted <math>\mathcal{L}_{n} = \{(X^{(1)}, Z^{(1)}), \ldots, (X^{(n)}, Z^{(n)})\}</math>, where <math>Z^{(i)} \in \{0,1\}^{K}</math> is a binary vector encoding the (possibly missing) label of <math>X^{(i)}</math>:
If <math>X^{(i)}</math> is labeled as <math>w_{k}</math>, then <math>Z^{(i)}_{k} = 1</math> and <math>Z^{(i)}_{l} = 0</math> for <math>l \neq k</math>; if <math>X^{(i)}</math> is unlabeled, then <math>Z^{(i)}_{l} = 1</math> for <math>l = 1, \ldots, K</math>.
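As a concrete illustration of this encoding, the sketch below builds the indicator matrix <math>Z</math> for a toy dataset of four points with <math>K = 3</math> classes (the variable names and values are illustrative, not from the source):

<pre>
import numpy as np

K = 3                        # number of classes w_1, ..., w_K
labels = [1, None, 0, None]  # 0-based class indices; None marks an unlabeled point

# Z[i, k] = 1 iff w_{k+1} is still a possible label for point i
Z = np.ones((len(labels), K))
for i, y in enumerate(labels):
    if y is not None:        # labeled point: one-hot row
        Z[i] = 0.0
        Z[i, y] = 1.0
# Z == [[0,1,0], [1,1,1], [1,0,0], [1,1,1]]
</pre>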
The conditional entropy of the class labels given the observed variables is

<math>
H(Y|X,Z; \mathcal{L}_{n}) = - \sum^{n}_{i=1} \sum_{k=1}^{K} P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)}) \log P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)})
</math>
Assuming that labels are missing at random, we have that

<math>
P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)}) = \frac{Z^{(i)}_{k} P(Y^{(i)}=w_{k}|X^{(i)})}{\sum_{l=1}^{K} Z^{(i)}_{l} P(Y^{(i)}=w_{l}|X^{(i)})}
</math>
The entropy-regularized criterion is defined as the conditional log-likelihood minus a weighted conditional entropy term:

<math>
\begin{alignat}{2}
C(\boldsymbol{\theta}, \lambda; \mathcal{L}_{n}) & = L(\boldsymbol{\theta}; \mathcal{L}_{n}) - \lambda H(Y|X,Z; \mathcal{L}_{n}) \\
& = \sum^{n}_{i=1} \log\Big(\sum^{K}_{k=1} Z^{(i)}_{k} P(Y^{(i)}=w_{k}|X^{(i)})\Big) + \lambda \sum^{n}_{i=1} \sum_{k=1}^{K} P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)}) \log P(Y^{(i)}=w_{k}|X^{(i)}, Z^{(i)})
\end{alignat}
</math>
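A minimal sketch of evaluating this criterion for fixed model outputs, assuming the posteriors <math>P(Y^{(i)}=w_{k}|X^{(i)})</math> are supplied as an <math>n \times K</math> array <code>probs</code> and <math>Z</math> is built as above; the function name and the <code>eps</code> smoothing constant are assumptions for illustration, not from the source:

<pre>
import numpy as np

def entropy_regularized_criterion(probs, Z, lam, eps=1e-12):
    """C(theta, lambda; L_n) given probs[i, k] = P(Y^(i)=w_k | X^(i))."""
    # Conditional log-likelihood: sum_i log( sum_k Z_ik P(Y^(i)=w_k | X^(i)) )
    log_likelihood = np.log((Z * probs).sum(axis=1) + eps).sum()

    # Posterior under the missing-at-random assumption:
    # P(Y^(i)=w_k | X^(i), Z^(i)) = Z_ik P(.|X) / sum_l Z_il P(.|X)
    post = Z * probs
    post = post / post.sum(axis=1, keepdims=True)

    # -lambda * H(Y|X,Z): labeled points have one-hot posteriors and
    # contribute ~0, so only uncertainty on unlabeled points is penalized.
    entropy_term = lam * (post * np.log(post + eps)).sum()

    return log_likelihood + entropy_term
</pre>

Maximizing this quantity in <math>\boldsymbol{\theta}</math> trades off fit on the labeled points against low-entropy (confident) predictions on the unlabeled ones.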