IBM Model 2


Citation

Brown, P. F., Della Pietra, V. J., Della Pietra, S. A., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2), 263–311.

Online version

pdf

Summary

IBM Model 2 is an extension of IBM Model 1.

This model addresses the weak reordering properties of IBM Model 1 by modeling the absolute distortion between the positions of words in a parallel sentence pair.

Model

One problem with IBM Model 1 is that it is very weak at modeling reordering, since $Pr(t, a \mid s)$ is calculated using only the lexical translation probabilities $tr(t_j \mid s_{a_j})$. As a result, if the model is presented with two translation candidates that have the same lexical translations but a different ordering of the translated words, it assigns both candidates the same score.
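The invariance described above can be checked with a small sketch. The translation table `TR`, the sentence pair, and the function name `model1_score` are hypothetical and chosen only for illustration. The sketch assumes the usual Model 1 form, a constant factor times the product of lexical translation probabilities, and it ignores alignments to the NULL word except in the `(I + 1)` normalizer.

```python
from math import isclose

# Hypothetical toy table of lexical translation probabilities
# tr(target_word | source_word); the values are made up.
TR = {
    ("the", "das"): 0.7,
    ("house", "Haus"): 0.8,
    ("is", "ist"): 0.8,
    ("small", "klein"): 0.4,
}

def model1_score(target, source, alignment, epsilon=1.0):
    """Pr(t, a | s) under IBM Model 1.

    alignment[j] is the 0-based source position aligned to target position j.
    """
    I, J = len(source), len(target)
    score = epsilon / (I + 1) ** J  # uniform alignment term
    for j, word in enumerate(target):
        score *= TR.get((word, source[alignment[j]]), 0.0)
    return score

source = ["das", "Haus", "ist", "klein"]
# Same words and same word-to-word links, but a different target order.
monotone = model1_score(["the", "house", "is", "small"], source, [0, 1, 2, 3])
scrambled = model1_score(["small", "the", "is", "house"], source, [3, 0, 2, 1])
print(isclose(monotone, scrambled))
```

Because the score is a product of the same four lexical probabilities in both cases, the two candidates are indistinguishable to the model.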

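Mixture-based alignment models (IBM Model 2) address this problem by modeling the absolute distortion in word positions between the two languages. They introduce an alignment probability distribution $p_a(i \mid j, I, J)$, where $i$ and $j$ are the word positions in the source and target sentences, and $I$ and $J$ are the lengths of the source and target sentences. Thus the equation for $Pr(t, a \mid s)$ becomes:

$$Pr(t, a \mid s) = \epsilon \prod_{j=1}^{J} tr(t_j \mid s_{a_j}) \, p_a(a_j \mid j, I, J)$$

where $a_j$ is the source position aligned to target position $j$, $\epsilon$ is a normalization constant, and the alignment probability distribution $p_a(i \mid j, I, J)$ models the probability that the word in position $i$ of the source sentence is reordered into position $j$ of the target sentence.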
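As a rough illustration of how the alignment distribution breaks the tie between reorderings, the sketch below scores the same two candidates with a Model 2 style product. Everything here is an assumption made for the example: the table `TR`, the function names, and especially `align_prob`, which is a hand-written distribution that favors the diagonal. In the actual model this distribution is estimated from parallel data, not fixed by hand, and the NULL word is omitted here for brevity.

```python
from math import exp, isclose

# Hypothetical toy table of lexical translation probabilities
# tr(target_word | source_word); the values are made up.
TR = {
    ("the", "das"): 0.7,
    ("house", "Haus"): 0.8,
    ("is", "ist"): 0.8,
    ("small", "klein"): 0.4,
}

def align_prob(i, j, I, J):
    """Hypothetical alignment distribution over source positions i.

    Probability mass decays with the distance between the relative
    positions i / I and j / J, and is normalized over all i for fixed j.
    """
    weights = [exp(-4.0 * abs(k / I - j / J)) for k in range(I)]
    return weights[i] / sum(weights)

def model2_score(target, source, alignment, epsilon=1.0):
    """Lexical probabilities times alignment probabilities, as in Model 2."""
    I, J = len(source), len(target)
    score = epsilon
    for j, word in enumerate(target):
        i = alignment[j]  # 0-based source position aligned to target position j
        score *= TR.get((word, source[i]), 0.0) * align_prob(i, j, I, J)
    return score

source = ["das", "Haus", "ist", "klein"]
monotone = model2_score(["the", "house", "is", "small"], source, [0, 1, 2, 3])
scrambled = model2_score(["small", "the", "is", "house"], source, [3, 0, 2, 1])
print(monotone > scrambled)
```

The lexical factors are identical for both candidates, so any difference in score comes only from the alignment term, which under this assumed distribution penalizes words that move far from their source position.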