Generalized Expectation Criteria

From Cohen Courses
Revision as of 13:56, 2 November 2011 by Daegunw (talk | contribs)

Summary

Generalized expectation (GE) criteria can be viewed as a parameter estimation method that can augment or replace traditional parameter estimation methods such as maximum likelihood estimation.

Support Vector Machines or Conditional Random Fields to efficiently optimize the objective function, especially in the online setting. Stochastic optimizations like this method are known to be faster when trained with large, redundant data sets.

Expectation

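Let $X$ be some set of variables, with assignments denoted $x \in \mathcal{X}$. Let $\theta$ be the parameters of a model that defines a probability distribution $p_\theta(X)$. The expectation of a function $f(X)$ according to the model is

$$E_\theta[f(X)] = \sum_{x \in \mathcal{X}} p_\theta(x) \, f(x)$$

We can partition the variables into "input" variables $X$ and "output" variables $Y$, where the outputs are conditioned on the inputs. When assignments of the input variables $\tilde{x} = \{x_1, x_2, \ldots\}$ are provided, the conditional expectation is

$$E_\theta[f(X,Y) \mid \tilde{x}] = \sum_{x \in \tilde{x}} \sum_{y \in \mathcal{Y}} p_\theta(y \mid x) \, f(x,y)$$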
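The two expectations above can be sketched numerically for small discrete distributions. This is a minimal illustration, assuming probabilities and function values are already tabulated as NumPy arrays; the names `expectation` and `conditional_expectation` are hypothetical and not taken from the original page.

```python
import numpy as np

def expectation(p, f_values):
    """Model expectation: sum over x of p(x) * f(x), for a discrete distribution.

    p        : shape (m,), probabilities of the m possible assignments x
    f_values : shape (m,), value of f at each assignment
    """
    p = np.asarray(p, dtype=float)
    f_values = np.asarray(f_values, dtype=float)
    return float(p @ f_values)

def conditional_expectation(cond_probs, f_values):
    """Conditional expectation: sum over provided inputs x and outputs y
    of p(y | x) * f(x, y).

    cond_probs : shape (n, k), row i holds p(y | x_i) over k output values
    f_values   : shape (n, k) for scalar f, or (n, k, d) for vector-valued f
    """
    cond_probs = np.asarray(cond_probs, dtype=float)
    f_values = np.asarray(f_values, dtype=float)
    # Contract over the input axis (n) and the output axis (k).
    return np.tensordot(cond_probs, f_values, axes=([0, 1], [0, 1]))

# Example: f(x, y) = 1 when y is the second label, 0 otherwise.
probs = [[0.9, 0.1], [0.4, 0.6]]
f_xy = [[0.0, 1.0], [0.0, 1.0]]
total = conditional_expectation(probs, f_xy)  # expected mass on the second label
```

Here `total` is the expected number of provided inputs assigned the second label, since the sum over inputs is not normalized.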

Generalized Expectation

A generalized expectation (GE) criterion is a function $G$ that takes the model's expectation $E_\theta[f(X)]$ of a function $f(X)$ as an argument and returns a scalar. The criterion is then added as a term in the parameter estimation objective function.

Alternatively, $G$ can be defined based on a distance to a target value for $E_\theta[f(X)]$. Let $\tilde{f}$ be the target value and $\Delta(\cdot, \cdot)$ be some distance function; then we can define $G$ in the following way:

$$G_{\tilde{f}}(E_\theta[f(X)]) = -\Delta(E_\theta[f(X)], \tilde{f})$$
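A distance-based GE term of this kind can be sketched as follows. The squared Euclidean distance, the `weight` parameter, and all function names are assumptions chosen for illustration; the page itself leaves the distance function unspecified.

```python
import numpy as np

def squared_l2(a, b):
    """Squared Euclidean distance (one possible choice of distance function)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def ge_term(model_expectation, target, distance=squared_l2):
    """GE term: negative distance between the model expectation and its target."""
    return -distance(model_expectation, target)

def objective(log_likelihood, model_expectation, target, weight=1.0):
    """Hypothetical objective: log-likelihood plus a weighted GE term."""
    return log_likelihood + weight * ge_term(model_expectation, target)

# The term is 0 when the expectation matches the target and negative otherwise.
score = ge_term([0.7, 0.3], [0.5, 0.5])
```

Because the term is a negative distance, maximizing the objective pulls the model's expectation toward the target value.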

Use Cases

Application to semi-supervised learning

Mann and McCallum (ICML 2007) describe an application of GE to a semi-supervised learning problem. The GE term used there expresses a preference (prior) about the marginal class distribution, which is either provided directly by a human expert or estimated from labeled data.

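Let $\tilde{p}$ be the target distribution over class labels and $f(x,y) = \frac{1}{n} \vec{1}(y)$, where $n$ is the number of provided input instances ($\vec{1}(y)$ denotes the vector indicator function on labels $y \in \mathcal{Y}$). Since the expectation of $f(x,y)$ is the model's predicted distribution over labels, we can define a simple GE term as the negative KL-divergence between the predicted distribution and the target distribution:

$$-D\left(\tilde{p} \,\|\, E_\theta[f(X,Y) \mid \tilde{x}]\right)$$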
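The label-distribution term can be sketched in a few lines, assuming the classifier's per-instance class probabilities on unlabeled data are available as an array. The function names and the `eps` smoothing constant are hypothetical, and the KL direction (target first, prediction second) is an assumption of this sketch.

```python
import numpy as np

def predicted_label_distribution(cond_probs):
    """Average p(y | x) over the provided instances.

    cond_probs : shape (n, k), row i holds the model's p(y | x_i)
    """
    cond_probs = np.asarray(cond_probs, dtype=float)
    return cond_probs.mean(axis=0)

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) for discrete distributions.

    Entries with p == 0 contribute nothing; q is floored at eps to avoid
    division by zero (an assumption of this sketch).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def label_regularization_term(cond_probs, target):
    """GE term: negative KL divergence from the target label distribution
    to the model's predicted label distribution."""
    return -kl_divergence(target, predicted_label_distribution(cond_probs))

# Two unlabeled instances, two classes.
probs = [[0.9, 0.1], [0.5, 0.5]]
matched = label_regularization_term(probs, [0.7, 0.3])     # prediction equals target
mismatched = label_regularization_term(probs, [0.5, 0.5])  # prediction differs
```

The term reaches its maximum of zero when the predicted label distribution equals the target, so adding it to the objective penalizes models whose predictions on unlabeled data drift from the expected class proportions.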


Related Papers