Maximum Entropy model

From Cohen Courses

This is a method.


The principle of maximum entropy was first expounded by E.T. Jaynes in two papers in 1957, where he emphasized a natural correspondence between statistical mechanics and information theory. In particular, Jaynes offered a new and very general rationale for why the Gibbsian method of statistical mechanics works. He argued that the entropy of statistical mechanics and the information entropy of information theory are in principle the same thing. Consequently, statistical mechanics should be seen as just one particular application of a general tool of logical inference and information theory.


In most practical cases, the stated prior data or testable information is given by a set of conserved quantities (average values of some moment functions) associated with the probability distribution in question. This is the way the maximum entropy principle is most often used in statistical thermodynamics. Another possibility is to prescribe some symmetries of the probability distribution. The equivalence between conserved quantities and their corresponding symmetry groups implies that these two ways of specifying the testable information in the maximum entropy method are equivalent as well.
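As a rough illustration of the moment-constraint case, the following Python sketch finds the maximum entropy distribution over the faces of a die when the only testable information is the average roll. It relies on the standard result that the maximizer has the Gibbs form p_i ∝ exp(λ x_i) and finds λ by bisection. The function names `maxent_with_mean` and `entropy`, the dice example, the target mean of 4.5, and the search bracket [-50, 50] are all assumptions made for this sketch, not something taken from the text above.

```python
import math


def maxent_with_mean(values, target_mean, tol=1e-12):
    """Maximum entropy distribution on `values` with a prescribed mean.

    Assumes the maximizer has the Gibbs form p_i ∝ exp(lam * x_i) and
    finds lam by bisection, since the mean increases monotonically in lam.
    """
    if not (min(values) < target_mean < max(values)):
        raise ValueError("target_mean must lie strictly between min and max")

    def dist(lam):
        # Subtract the largest exponent before exponentiating for stability.
        exps = [lam * x for x in values]
        m = max(exps)
        weights = [math.exp(e - m) for e in exps]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * x for p, x in zip(dist(lam), values))

    lo, hi = -50.0, 50.0  # assumed bracket; wide enough for small examples
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2.0)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


faces = [1, 2, 3, 4, 5, 6]
p = maxent_with_mean(faces, 4.5)
print([round(pi, 4) for pi in p])
print("mean:", sum(pi * x for pi, x in zip(p, faces)))
print("entropy:", entropy(p), "vs. log(6):", math.log(6))
```

With the target mean set to 3.5, the constraint carries no information beyond the set of outcomes and the same routine returns (numerically) the uniform distribution.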

The maximum entropy principle is also needed to guarantee the uniqueness and consistency of probability assignments obtained by different methods, statistical mechanics and logical inference in particular. Strictly speaking, trial distributions that do not maximize the entropy are not justified probability assignments for the stated information.

The maximum entropy principle makes explicit our freedom in using different forms of prior data. As a special case, a uniform prior probability density (Laplace's principle of indifference) may be adopted. Thus, the maximum entropy principle is not just an alternative to the methods of inference of classical statistics, but an important conceptual generalization of those methods.
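The uniform special case can be checked numerically. The sketch below, a small illustration rather than a proof, compares the entropy of the uniform distribution on six outcomes with the entropies of randomly generated distributions on the same outcomes. The helper names, the choice of six outcomes, the sample count, and the random seed are assumptions made for the example.

```python
import math
import random


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def random_distribution(n, rng):
    """Draw n non-negative weights and normalize them to sum to one."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


n = 6
uniform = [1.0 / n] * n

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [random_distribution(n, rng) for _ in range(1000)]
max_sampled = max(entropy(q) for q in samples)

print("entropy of uniform:", entropy(uniform))
print("log(n):            ", math.log(n))
print("best random sample:", max_sampled)
```

With no constraint beyond normalization, no sampled distribution exceeds the entropy log(n) of the uniform one, which is consistent with the principle of indifference arising as the unconstrained case.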

In ordinary language, the principle of maximum entropy can be said to express a claim of epistemic modesty, or of maximum ignorance. The selected distribution is the one that makes the least claim to being informed beyond the stated prior data, that is to say the one that admits the most ignorance beyond the stated prior data.

Relevant Papers