Difference between revisions of "Marcus and Wong, EMNLP 2002"
Revision as of 10:28, 27 September 2011
Citation
Marcu, D., & Wong, W. (2002). A phrase-based, joint probability model for statistical machine translation. In Proceedings of EMNLP, pp. 133–139.
Online version
http://www.isi.edu/~marcu/papers/jointmt2002.pdf
Summary
This work presents a phrase-to-phrase alignment model for Statistical Machine Translation.
Model
The phrase-to-phrase alignment model presented in this work is built upon the work in (Marcu and Wong, 2002). In this work, words are clustered into phrases by a generative process that constructs an ordered set of phrases in the target language, an ordered set of phrases in the source language, and a set of alignments between phrases, where each alignment indicates which target phrase is paired with which source phrase. The process consists of two steps:
- First, the number of components is chosen, and each of the phrase pairs is generated independently.
- Then, an ordering for the source phrases is chosen, and all the source and target phrases are aligned one to one.
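The two steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the phrase inventory `PAIRS`, its `WEIGHTS`, and the function names are hypothetical, and the source ordering is drawn by a uniform shuffle rather than by the distortion model the page describes.

```python
import random

# Hypothetical toy inventory of (source phrase, target phrase) pairs with
# made-up multinomial weights; not taken from the paper.
PAIRS = [("la maison", "the house"), ("bleue", "blue"), ("est", "is")]
WEIGHTS = [0.5, 0.3, 0.2]


def sample_num_pairs(p_stop, rng):
    """Draw l >= 1 so that P(l) = p_stop * (1 - p_stop) ** (l - 1)."""
    l = 1
    while rng.random() >= p_stop:
        l += 1
    return l


def generate(pairs, weights, p_stop, rng):
    # Step 1: choose the number of phrase pairs, then draw each independently.
    l = sample_num_pairs(p_stop, rng)
    drawn = rng.choices(pairs, weights=weights, k=l)
    target = [t for _, t in drawn]

    # Step 2: choose an ordering for the source phrases.
    # ASSUMPTION: a uniform shuffle stands in for the distortion model.
    order = list(range(l))
    rng.shuffle(order)
    source = [drawn[i][0] for i in order]

    # One-to-one alignment: (j, k) links target phrase j to source position k.
    alignment = [(j, order.index(j)) for j in range(l)]
    return source, target, alignment


source, target, alignment = generate(PAIRS, WEIGHTS, 0.5, random.Random(0))
```

Each target position and each source position appears in exactly one alignment link, which mirrors the one-to-one alignment of phrases in the second step.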
The number of components is parametrized using a geometric distribution with a stop parameter.
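For reference, a geometric distribution over the number of components can be written as below. The symbols are assumptions chosen for this sketch, not the page's own notation: \ell is the number of phrase pairs, p_stop is the stop parameter, and the support is assumed to start at 1.

```latex
P(\ell) = p_{\mathrm{stop}} \, (1 - p_{\mathrm{stop}})^{\ell - 1},
\qquad \ell = 1, 2, \dots
\qquad\Longrightarrow\qquad
\mathbb{E}[\ell] = \frac{1}{p_{\mathrm{stop}}}
```

Under this form, a smaller stop parameter favors sentence pairs that are split into more phrase pairs.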
Phrase pairs are drawn from an unknown multinomial distribution.
A simple position-based distortion model is used.
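Finally, the joint probability model for aligning sentences consisting of phrase pairs is given by the product of the components above: the geometric probability of the number of phrase pairs, the multinomial probability of each phrase pair, and the distortion probability of the alignment.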
In the experiments, the stop parameter and the distortion parameter were set to 0.1 and 0.85, respectively.
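How the components combine into one score can be sketched as below, working in log space. Several things here are assumptions rather than the paper's definitions: the function names, the toy multinomial `theta`, and above all the distortion term, which is assumed to be an unnormalized penalty of `b ** |j - k|` per alignment link. Reading 0.1 as the stop parameter and 0.85 as the distortion base is likewise an assumption.

```python
import math


def log_geometric(l, p_stop):
    # log of p_stop * (1 - p_stop) ** (l - 1), for l >= 1
    return math.log(p_stop) + (l - 1) * math.log(1.0 - p_stop)


def log_distortion(alignment, b):
    # ASSUMPTION: each link (j, k) contributes b ** |j - k| (unnormalized).
    return sum(abs(j - k) * math.log(b) for j, k in alignment)


def log_joint(pairs, alignment, theta, p_stop, b):
    # Sum of the three log components: count prior, phrase pairs, distortion.
    score = log_geometric(len(pairs), p_stop)
    score += sum(math.log(theta[pair]) for pair in pairs)
    score += log_distortion(alignment, b)
    return score


# Hypothetical toy multinomial over (source phrase, target phrase) pairs.
theta = {("la maison", "the house"): 0.5, ("bleue", "blue"): 0.5}
pairs = list(theta)

monotone = log_joint(pairs, [(0, 0), (1, 1)], theta, p_stop=0.1, b=0.85)
swapped = log_joint(pairs, [(0, 1), (1, 0)], theta, p_stop=0.1, b=0.85)
```

With a base below 1, the swapped alignment scores lower than the monotone one, which is the qualitative behavior expected of a position-based distortion penalty.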