Bansal et al, ACL 2011
Note
still incomplete...
Citation
M. Bansal, C. Quirk, and R. Moore. 2011. Gappy phrasal alignment by agreement. In Proceedings of ACL.
Online version
Summary
This work defines a phrase-to-phrase alignment model for Statistical Machine Translation. The model is an HMM that builds on the work presented in Vogel et al, COLING 1996, and extends it to allow both contiguous and discontinuous phrases (gappy phrases).
The quality of the alignments is further improved by employing the alignment agreement described in [Liang et al, 2006], where the alignment models for both directions are trained with a joint objective function, rather than combined by symmetrization.
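As a rough illustration of the agreement idea, the sketch below combines link posteriors from two directional models by multiplying them, which is the approximation commonly associated with [Liang et al, 2006]. The function names, the matrix layout and the threshold value are assumptions made for this example and are not taken from the paper.

```python
def agreement_posteriors(forward, backward):
    """Combine link posteriors from two directional alignment models.

    forward[i][j]  : posterior of linking source word i to target word j
                     under the source-to-target model (hypothetical layout).
    backward[j][i] : posterior of the same link under the target-to-source model.
    Returns a matrix indexed [i][j] holding the product of both posteriors.
    """
    n_src = len(forward)
    n_tgt = len(backward)
    return [
        [forward[i][j] * backward[j][i] for j in range(n_tgt)]
        for i in range(n_src)
    ]


def decode_links(combined, threshold=0.5):
    """Keep every link whose combined posterior reaches the threshold (assumed value)."""
    return {
        (i, j)
        for i, row in enumerate(combined)
        for j, p in enumerate(row)
        if p >= threshold
    }


# Toy posteriors for a two-word sentence pair (made-up numbers).
forward = [[0.9, 0.1], [0.2, 0.8]]
backward = [[0.8, 0.3], [0.2, 0.7]]
combined = agreement_posteriors(forward, backward)
links = decode_links(combined)
```

A link survives only when both directions assign it a high posterior, which is the intuition behind training the two models to agree instead of merging their outputs afterwards.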
Experimental results show improvements in terms of AER (Alignment Error Rate) over the work in [Liang et al, 2006]. Translation quality, evaluated using BLEU, also improved over the same baseline.
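For readers unfamiliar with AER, the following sketch computes it using the usual definition with sure and possible gold links, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). This summary does not state which gold annotation the paper used, so the function name and the toy data are assumptions for illustration only.

```python
def alignment_error_rate(predicted, sure, possible):
    """Alignment Error Rate for one set of predicted links.

    predicted : iterable of (source_index, target_index) links from the aligner.
    sure      : gold links annotated as certain.
    possible  : gold links annotated as acceptable; sure links are added
                to this set, following the usual convention.
    """
    predicted = set(predicted)
    sure = set(sure)
    possible = set(possible) | sure
    denominator = len(predicted) + len(sure)
    if denominator == 0:
        return 0.0  # nothing predicted and nothing expected
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / denominator


# Toy example (made-up links): two of three predictions are correct.
aer = alignment_error_rate(
    predicted={(0, 0), (1, 1), (2, 2)},
    sure={(0, 0), (1, 1)},
    possible={(2, 3)},
)
```

Lower values are better: a prediction that matches the sure links exactly and adds nothing else scores 0.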
Description of the Method
The word alignment model described in this work is an extension of the work in Vogel et al, COLING 1996, where a word-to-phrase alignment model was presented. Two extensions to this model are proposed.
The first extension allows phrasal alignments, where each word on the target side can be aligned with more than one source word. This makes the model semi-Markov, since each state can emit more than one source word (observation) at each time step, as opposed to the previous work using a regular HMM, where each state emits a single source word at each time step.
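The semi-Markov idea can be sketched as a Viterbi search in which each state (target word) emits a contiguous phrase of one or more source words. This is a minimal sketch, not the paper's implementation: the scoring callbacks `emit_logp` and `trans_logp`, the `max_len` limit, the uniform transition model and the toy probabilities are all hypothetical.

```python
import math


def semi_markov_viterbi(source, n_states, emit_logp, trans_logp, max_len=2):
    """Best segmentation of `source` into phrases, each emitted by one state.

    emit_logp(state, phrase)  -> log-probability that `state` emits the tuple `phrase`.
    trans_logp(prev, state)   -> log-probability of moving from `prev` to `state`
                                 (`prev` is None for the first segment).
    Returns a list of (state, phrase) pairs in source order.
    """
    n = len(source)
    if n == 0:
        return []
    # best[end][state] = (score, segment_start, previous_state) for the best
    # path that covers source[:end] and whose last segment was emitted by `state`.
    best = [dict() for _ in range(n + 1)]
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            phrase = tuple(source[start:end])
            for state in range(n_states):
                emit = emit_logp(state, phrase)
                if start == 0:
                    candidates = [(trans_logp(None, state) + emit, None)]
                else:
                    candidates = [
                        (score + trans_logp(prev, state) + emit, prev)
                        for prev, (score, _, _) in best[start].items()
                    ]
                for score, prev in candidates:
                    if state not in best[end] or score > best[end][state][0]:
                        best[end][state] = (score, start, prev)
    # Follow back-pointers from the best final state.
    state = max(best[n], key=lambda s: best[n][s][0])
    end = n
    segments = []
    while end > 0:
        _, start, prev = best[end][state]
        segments.append((state, tuple(source[start:end])))
        end, state = start, prev
    return segments[::-1]


# Toy model (made-up numbers): state 0 = "cannot", state 1 = "swim".
EMIT = {
    (0, ("can", "not")): 0.8,
    (0, ("can",)): 0.1,
    (0, ("not",)): 0.1,
    (1, ("swim",)): 0.9,
}


def emit_logp(state, phrase):
    return math.log(EMIT.get((state, phrase), 1e-6))  # small floor for unseen pairs


def trans_logp(prev, state):
    return math.log(0.5)  # uniform over the two states, an assumption


segments = semi_markov_viterbi(["can", "not", "swim"], 2, emit_logp, trans_logp)
# segments == [(0, ("can", "not")), (1, ("swim",))]
```

With `max_len=1` the search reduces to an ordinary HMM Viterbi decode, which makes the difference between the two models explicit: the only change is that a segment may span several observations.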
The second extension allows alignments using phrases with gaps to be modeled, where a phrase with a gap is defined as $w_s * w_f$, where $w_s$ is the starting word, $w_f$ is the final word, and "*" can be any number of words. Furthermore, the alignment agreement work presented in [Liang et al, 2006] was employed and extended to the new space of alignments (alignments including gappy phrases).
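To make the gappy-phrase definition concrete, the sketch below enumerates every pair of words in a sentence that could form a phrase of the shape "w_s * w_f". The function name and the `min_gap` and `max_gap` parameters are assumptions for illustration; in particular, whether the gap may be empty is not specified here, so the default requires at least one skipped word.

```python
def gappy_phrases(words, min_gap=1, max_gap=None):
    """List candidate gappy phrases in a sentence.

    Each result is (start_index, final_index, (w_s, w_f)), where the words
    strictly between the two indices are the gap ("*").
    min_gap / max_gap bound the number of skipped words (hypothetical knobs).
    """
    results = []
    for i in range(len(words)):
        for j in range(i + 1 + min_gap, len(words)):
            gap = j - i - 1
            if max_gap is not None and gap > max_gap:
                break  # gaps only grow as j increases
            results.append((i, j, (words[i], words[j])))
    return results


# French negation is a typical discontinuous unit: "ne ... pas".
sentence = ["je", "ne", "mange", "pas"]
candidates = gappy_phrases(sentence)
```

In this toy sentence the candidate `(1, 3, ("ne", "pas"))` is the one a gappy alignment model would ideally link to a single target word such as "not".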
Experimental Results
Related Work
The work in Marcu and Wong, EMNLP 2002, describes a joint probability distribution, which is used and extended in this work.