Bansal et al, ACL 2011
Note
still incomplete...
Citation
M. Bansal, C. Quirk, and R. Moore. 2011. Gappy phrasal alignment by agreement. In Proceedings of ACL.
Online version
Summary
This work defines a phrase-to-phrase alignment model for Statistical Machine Translation. An HMM-based model is built on the work presented in Vogel et al, COLING 1996, extending it to allow both contiguous and discontinuous (gappy) phrases.
The quality of the alignments is further improved by employing the alignment agreement of [Liang et al., 2006], in which the two directional alignment models are trained with a joint objective function, rather than by symmetrization heuristics.
Experimental results show improvements in terms of AER (Alignment Error Rate) over the work in [Liang et al., 2006]. Translation quality, evaluated using BLEU, also improves over the same baseline.
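To make the agreement idea concrete, below is a minimal sketch (in Python, with made-up toy numbers) of the heuristic E-step described in Liang et al., 2006, where the posterior probability of each alignment link under the two directional models is multiplied element-wise; the function name agree_posteriors and the use of numpy are illustrative assumptions, not the paper's implementation.

<pre>
import numpy as np

def agree_posteriors(post_fwd, post_rev):
    """Heuristic E-step of alignment by agreement (Liang et al., 2006).

    post_fwd : (I, J) matrix; post_fwd[i, j] is the posterior probability of
               a link between target word i and source word j under the
               forward-direction model
    post_rev : (J, I) matrix; the same links' posteriors under the
               reverse-direction model

    The element-wise product keeps only the links that both directional
    models believe in; the result is then used as the expected counts when
    re-estimating each model.
    """
    return post_fwd * post_rev.T

# Toy example: 2 target words x 3 source words.
p_fwd = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2]])
p_rev = np.array([[0.6, 0.4],
                  [0.1, 0.9],
                  [0.5, 0.5]])
print(agree_posteriors(p_fwd, p_rev))
</pre>

The combined posteriors replace each model's own posteriors when collecting expected counts, pushing the two directions toward agreeing on the same links; this paper applies the same agreement idea to its gappy phrasal models.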
Description of the Method
The word alignment model described in this work is an extension of the HMM-based word alignment model presented in Vogel et al, COLING 1996.
The first extension allows each word on the target side to be aligned with more than one source word. This makes the model semi-Markov, since each state can emit more than one source word (observation) per time step.
The second extension allows phrases with gaps (discontinuous phrases) on the state side.
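As an illustration of the first extension, here is a small, hedged sketch of a forward pass for a semi-Markov HMM in which a state emits a whole span of source words at each step (the gappy state side from the second extension is left out for brevity); the state, transition, and emission tables, the per-word span scoring, and the MAX_SPAN limit are toy assumptions made for this demo, not the paper's actual parameterization.

<pre>
import numpy as np

MAX_SPAN = 3  # toy limit on how many source words a single state may emit

def semi_markov_forward(obs, init, trans, span_emit):
    """Forward pass of a semi-Markov HMM.

    obs       : list of source-word indices (the observation sequence)
    init      : (S,) initial state probabilities
    trans     : (S, S) state-transition probabilities
    span_emit : function (state, span) -> probability of the state emitting
                that whole span of observations in one step

    Returns the total probability of the observation sequence.
    """
    n, S = len(obs), len(init)
    # alpha[t, s] = prob. of having consumed the first t observations,
    # with the most recent span emitted by state s
    alpha = np.zeros((n + 1, S))
    for s in range(S):                                    # first span
        for length in range(1, min(MAX_SPAN, n) + 1):
            alpha[length, s] += init[s] * span_emit(s, obs[:length])
    for t in range(1, n):                                 # later spans
        for s in range(S):
            for length in range(1, min(MAX_SPAN, n - t) + 1):
                for prev in range(S):
                    alpha[t + length, s] += (alpha[t, prev] * trans[prev, s]
                                             * span_emit(s, obs[t:t + length]))
    return alpha[n].sum()

# Toy usage: 2 states, 3 source-word types; a span is scored as the product
# of per-word emission probabilities (an assumption made for this demo).
word_emit = np.array([[0.5, 0.3, 0.2],
                      [0.1, 0.4, 0.5]])

def span_emit(state, span):
    p = 1.0
    for w in span:
        p *= word_emit[state, w]
    return p

init = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
print(semi_markov_forward([0, 2, 1, 1], init, trans, span_emit))
</pre>

Indexing the forward chart by the number of source words consumed, rather than by one position per emission, is what makes the model semi-Markov: each state accounts for a variable-length span of observations before transitioning.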
Experimental Results
Related Work
The work in Marcu and Wong, EMNLP 2002, describes a joint probability model over phrase pairs, which is used and extended in this work.