GeneralizedIterativeScaling
This is one of the earliest methods used for inference in log-linear models. Though more sophisticated and faster methods have since been developed, it still provides useful insight into how log-linear models work.
What problem does it address?
The objective of this method is to find a probability function of the form

<math> p_i = \pi_i \mu \prod_{s=1}^{d} \mu_s^{b_{si}} \qquad (1) </math>

satisfying the constraints

<math> \sum_{i \in I} p_i b_{si} = k_s, \quad s = 1, \ldots, d \qquad (2) </math>

where <math>I</math> is an index set over which the probability distribution <math>p = \{p_i\}</math> has to be determined, <math>\pi = \{\pi_i\}</math> is a given probability distribution, and each <math>b_s = \{b_{si}\}</math> is a subprobability function (<math>b_{si} \geq 0</math> and <math>\sum_{s=1}^{d} b_{si} \leq 1</math> for any <math>i</math>); each <math>k_s</math> is a constant.
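The fitting procedure itself is short. Below is a minimal sketch of the classical GIS iteration, assuming uniform <math>\pi</math> and that <math>\sum_{s} b_{si} = 1</math> for every <math>i</math> (the case the classical update handles; a slack feature can always enforce it). The function name <tt>gis</tt>, the array shapes, and the stopping tolerance are illustrative choices, not part of the original description.

<pre>
import numpy as np

def gis(b, k, n_iter=1000, tol=1e-10):
    # Sketch of the GIS update under the assumptions above:
    # b : (d, n) array of feature values b[s, i] >= 0,
    #     with each column summing to 1 (sum_s b_si = 1 for every i)
    # k : (d,) array of target expectations k_s from constraint (2)
    d, n = b.shape
    mu = np.ones(d)                    # one multiplier mu_s per constraint
    for _ in range(n_iter):
        # distribution of form (1) with uniform pi; the overall
        # factor mu in (1) is just the normalizer
        p = np.prod(mu[:, None] ** b, axis=0)
        p /= p.sum()
        k_hat = b @ p                  # current expectations sum_i p_i b_si
        if np.max(np.abs(k_hat - k)) < tol:
            break
        mu *= k / k_hat                # classical GIS multiplicative update
    return p
</pre>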
Existence of a solution
If a <math>p</math> of form (1) exists satisfying (2), then it minimizes the Kullback-Leibler divergence <math>D(p \| \pi) = \sum_i p_i \log(p_i / \pi_i)</math> among all probability distributions satisfying (2), and it is unique. When the <math>\pi_i</math> are constant (uniform <math>\pi</math>), this essentially boils down to the following statement.
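To spell the reduction out: with uniform <math>\pi_i = 1/|I|</math>, the divergence differs from the negative entropy only by a constant,

<math> D(p \| \pi) = \sum_i p_i \log\frac{p_i}{1/|I|} = \log|I| + \sum_i p_i \log(p_i) = \log|I| - H(p), </math>

so minimizing <math>D(p \| \pi)</math> subject to (2) is the same as maximizing <math>H(p)</math> subject to (2).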
Maximizing Entropy
If there exists a positive probability function of the form

<math> p_i = \mu \prod_{s=1}^{d} \mu_s^{b_{si}} </math>

satisfying (2), then it maximizes the entropy

<math> H(p) = - \sum_i p_i \log(p_i) </math>
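As a quick sanity check, the sketch above can be run on a made-up toy problem (all numbers here are illustrative):

<pre>
b = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.0, 0.5, 1.0, 1.0]])  # each column sums to 1
k = np.array([0.3, 0.7])              # consistent targets: k.sum() == 1
p = gis(b, k)
print(p)        # maximum-entropy p of form (1)
print(b @ p)    # should be close to [0.3, 0.7]
</pre>

Because the columns of <math>b</math> sum to 1, the targets must satisfy <math>\sum_s k_s = 1</math> for the constraints (2) to be consistent.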