Expectation Maximization
Expectation Maximization (EM) is an iterative method for finding the maximum likelihood estimates of the unknown parameters, , of a probabilistic model with latent variables, , given observed data, . The estimates are found by iteratively alternating between two steps: (1) the Expectation (E)-Step, in which the posterior distribution of the latent variables is calculated, using a conditional expectation given the data, and the current estimate of the model parameters, and (2) the Maximization (M)-Step, in which the model parameters are re-estimated using the distribution over .
Literature
Dempster et al., 1997 formalize the EM algorithm and provide a proof of convergence.
An alternative explanation to EM gives the intuition behind alternating between variables in terms of lower bound maximization (Neal and Hinton, 1998, and Minka, 1998). In this formulation, the E-step is regarded as determining a lower bound for the posterior distribution, and the M-step as optimizing this bound, resulting in a beter estimate for the model parameters.
Algorithm
Let be a set of observations resulting for a probabilistic model with unknown parameters, , and hidden variables . We wish to recover a set of parameters, that will maximize the likelihood function:
We assume that directly optimizing is difficult and that considering the full data is easier.