Revision as of 10:44, 29 September 2011
Summary
This is a dynamic programming algorithm, used in Hidden Markov Models to efficiently compute the state posteriors over all the hidden state variables.
These values are then used in Posterior Decoding, which simply chooses the state with the highest posterior marginal for each position in the sequence.
The forward-backward algorithm runs in time linear in the length of the sequence, whereas a brute-force algorithm that checks all possible state sequences would take time exponential in the length of the sequence.
Problem description
Posterior decoding is one of several ways to find the best hidden state sequence <math>\hat{y}</math>. It consists of picking the highest state posterior for each position <math>i</math> in the sequence:
<math>
\hat{y}_i = \underset{y_i\in{S}}{\operatorname{argmax}}\ \gamma_i(y_i)
</math>
where <math>\gamma_i(y_i)</math> is the state posterior for position <math>i</math>. The state posterior is given by:
<math>
\gamma_i(s_l) = \sum_{y:y_i=s_l}P(y|x)
</math>
where the sum is over the sequence posteriors <math>P(y|x)</math> of all possible state sequences <math>y</math> in which position <math>i</math> is the state <math>s_l</math>.
The sequence posterior for a given sequence <math>y</math> is defined as:
<math>
P(y|x) = \frac{1}{Z}\prod_{i=1}^{N}\phi_i(y_i)\prod_{i=1}^{N-1}\phi_i(y_i,y_{i+1})
</math>
It is calculated as the product of all node potentials <math>\phi_i(y_i)</math> of the nodes in the sequence, corresponding to the state observation parameters for the state <math>y_i</math>, and the transition potentials <math>\phi_i(y_i,y_{i+1})</math> of the transitions in the sequence, corresponding to the transition parameters. These potentials are estimated during the training process of Hidden Markov Models. As an example, the sequence in red in the following figure would be calculated as the product of its four node potentials and its three transition potentials.
Thus, to calculate <math>\gamma_2(r)</math>, for instance, we would need to sum the sequence posteriors of the sequences r r r r, s r r r, r r s r, s r s r, r r r s, s r r s, r r s s and s r s s. A brute-force algorithm would compute each sequence posterior separately. This means that the number of computations needed to calculate a single state posterior would be in the order of <math>|S|^N</math>.
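The brute-force computation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original article: the states r/s, the sequence length, and all numeric potential values are invented for the example.

```python
from itertools import product

# Hypothetical example: states "r"/"s", sequence length 4; all numeric
# potential values below are invented purely for illustration.
states = ["r", "s"]
N = 4
phi_node = [{"r": 0.6, "s": 0.4} for _ in range(N)]        # node potentials phi_i(y_i)
phi_trans = [{(a, b): (0.7 if a == b else 0.3)             # transition potentials phi_i(y_i, y_{i+1})
              for a in states for b in states} for _ in range(N - 1)]

def seq_score(y):
    """Unnormalized sequence posterior: product of node and transition potentials."""
    score = phi_node[0][y[0]]
    for i in range(1, N):
        score *= phi_trans[i - 1][(y[i - 1], y[i])] * phi_node[i][y[i]]
    return score

# Brute force: enumerate all |S|**N sequences, which is exponential in N.
Z = sum(seq_score(y) for y in product(states, repeat=N))
gamma_2_r = sum(seq_score(y) for y in product(states, repeat=N) if y[1] == "r") / Z
```

Even for this toy model the loop visits <math>2^4 = 16</math> sequences; the forward-backward algorithm below avoids this blow-up.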
Forward-backward
The Forward-backward algorithm is a dynamic programming algorithm that can compute <math>\gamma_i(y_i)</math> for all states and positions in linear time, by exploiting the fact that each state depends only on the previous state.
It iterates through each node and transition once to build the forward probabilities <math>\alpha_i(y_i)</math> and another time to build the backward probabilities <math>\beta_i(y_i)</math>.
The forward probability <math>\alpha_i(s_l)</math> represents the probability that in position <math>i</math> we are in state <math>s_l</math> and that we have observed the input up to that position. The forward probability is defined by the following recurrence:
<math>
\alpha_1(s_l) = \phi_1(s_l)
</math>
<math>
\alpha_i(s_l) = \sum_{s_m\in{S}}\phi_{i-1}(s_m,s_l)\phi_i(s_l)\alpha_{i-1}(s_m)
</math>
where <math>S</math> is the set of possible states for each position.
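The recurrence translates directly into a forward pass. This is a sketch using the same kind of hypothetical potentials as the brute-force example (the states and numbers are invented for illustration):

```python
# Forward pass sketch; states and potential values are hypothetical.
states = ["r", "s"]
N = 4
phi_node = [{"r": 0.6, "s": 0.4} for _ in range(N)]
phi_trans = [{(a, b): (0.7 if a == b else 0.3)
              for a in states for b in states} for _ in range(N - 1)]

# alpha[i][s]: total unnormalized weight of all partial sequences
# that end in state s at position i.
alpha = [dict() for _ in range(N)]
for s in states:
    alpha[0][s] = phi_node[0][s]                 # base case: alpha_1(s) = phi_1(s)
for i in range(1, N):
    for s in states:
        alpha[i][s] = phi_node[i][s] * sum(
            phi_trans[i - 1][(t, s)] * alpha[i - 1][t] for t in states
        )

# Summing the forward probabilities at the last position gives Z.
Z = sum(alpha[N - 1][s] for s in states)
```

Each position only looks back at the previous position, so the pass costs <math>O(N|S|^2)</math> instead of <math>O(|S|^N)</math>.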
We can see in the example that the forward probability <math>\alpha_2(r)</math> corresponds to the sum of the sequence posteriors of the partial sequences that end in r at position 2 (r r and s r).
The forward probabilities at the last position can be used to calculate <math>Z</math>, the sum of the unnormalized sequence posteriors over all possible state sequences:
<math>
Z = \sum_{s_l\in{S}}\alpha_N(s_l)
</math>
where <math>N</math> is the last position of the sequence.
To calculate the state posteriors and transition posteriors, we also need to calculate the backward probabilities, whose recurrence operates in the inverse direction:
<math>
\beta_N(s_l) = 1
</math>
<math>
\beta_i(s_l) = \sum_{s_m\in{S}}\phi_i(s_l,s_m)\phi_{i+1}(s_m)\beta_{i+1}(s_m)
</math>
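Putting both passes together gives the state posteriors needed for posterior decoding. The sketch below is again built on hypothetical states and potential values invented for illustration; it combines the two passes as <math>\gamma_i(s_l) = \alpha_i(s_l)\beta_i(s_l)/Z</math>:

```python
# End-to-end forward-backward sketch; states and potentials are hypothetical.
states = ["r", "s"]
N = 4
phi_node = [{"r": 0.6, "s": 0.4} for _ in range(N)]
phi_trans = [{(a, b): (0.7 if a == b else 0.3)
              for a in states for b in states} for _ in range(N - 1)]

# Forward: alpha_1(s) = phi_1(s); alpha_i(s) = phi_i(s) * sum_t phi_{i-1}(t,s) * alpha_{i-1}(t)
alpha = [dict() for _ in range(N)]
for s in states:
    alpha[0][s] = phi_node[0][s]
for i in range(1, N):
    for s in states:
        alpha[i][s] = phi_node[i][s] * sum(
            phi_trans[i - 1][(t, s)] * alpha[i - 1][t] for t in states
        )

# Backward: beta_N(s) = 1; beta_i(s) = sum_t phi_i(s,t) * phi_{i+1}(t) * beta_{i+1}(t)
beta = [dict() for _ in range(N)]
for s in states:
    beta[N - 1][s] = 1.0
for i in range(N - 2, -1, -1):
    for s in states:
        beta[i][s] = sum(
            phi_trans[i][(s, t)] * phi_node[i + 1][t] * beta[i + 1][t] for t in states
        )

Z = sum(alpha[N - 1][s] for s in states)
# State posteriors: at every position they form a proper distribution over states.
gamma = [{s: alpha[i][s] * beta[i][s] / Z for s in states} for i in range(N)]
```

Posterior decoding then simply picks <math>\operatorname{argmax}_s \gamma_i(s)</math> at each position.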