Difference between revisions of "Contrastive Estimation"
Revision as of 16:35, 29 September 2011
Contrastive estimation is a method proposed by Smith and Eisner (2005) in "Contrastive Estimation: Training Log-Linear Models on Unlabeled Data".
The approach estimates log-linear models (e.g. Conditional Random Fields) in an unsupervised fashion. It focuses on the denominator of the log-linear model, exploiting the so-called implicit negative evidence in the probability mass assigned there.
Motivation
Smith and Eisner (2005) survey different estimation techniques for probabilistic graphical models (see the figure above). For HMMs, the usual choice is to optimize the joint likelihood. For log-linear models, various methods have been proposed that optimize conditional probabilities. There are also methods that directly maximize classification accuracy, the sum of conditional likelihoods, or expected local accuracy. However, none of these estimation techniques focuses specifically on the implicit negative evidence in the denominator of the standard log-linear model in an unsupervised setting.
Problem Formulation
Unlike the above methods, the contrastive estimation approach optimizes:
:<math>\prod_{i} p(X_i = x_i \mid X_i \in Neighbor(x_i), \vec\theta)</math>
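To make the objective concrete, here is a minimal Python sketch, not the authors' implementation. It makes several simplifying assumptions: the model has no hidden variables (the models in the paper sum over hidden structure such as tags), the features are toy bigram counts, and the neighborhood is the observed sentence plus all adjacent-word transpositions. The function names `score`, `neighborhood` and `contrastive_log_likelihood` are hypothetical and were chosen for illustration only.

```python
import math


def score(x, theta):
    """Unnormalized log-linear score exp(theta . f(x)).

    Assumption: f(x) counts the bigrams in x (toy features).
    """
    feats = {}
    for a, b in zip(x, x[1:]):
        feats[(a, b)] = feats.get((a, b), 0) + 1
    return math.exp(sum(theta.get(k, 0.0) * v for k, v in feats.items()))


def neighborhood(x):
    """Assumed neighborhood: x itself plus every adjacent-word transposition."""
    out = [tuple(x)]
    for i in range(len(x) - 1):
        y = list(x)
        y[i], y[i + 1] = y[i + 1], y[i]
        if tuple(y) not in out:  # skip duplicates, e.g. swapping equal words
            out.append(tuple(y))
    return out


def contrastive_log_likelihood(data, theta):
    """Log of prod_i p(X_i = x_i | X_i in Neighbor(x_i), theta)."""
    total = 0.0
    for x in data:
        x = tuple(x)
        # The denominator sums over the neighborhood only,
        # not over all possible sentences.
        z = sum(score(n, theta) for n in neighborhood(x))
        total += math.log(score(x, theta) / z)
    return total


data = [("a", "b", "c")]
uniform = contrastive_log_likelihood(data, {})
tuned = contrastive_log_likelihood(data, {("a", "b"): 1.0})
```

With all weights at zero, every neighbor is equally likely, so the observed sentence receives probability one over the neighborhood size. Raising the weight of a bigram that occurs in the observed sentence but in none of its perturbed neighbors moves probability mass from the neighbors to the observation, which is the sense in which the neighbors act as implicit negative evidence.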