Hopkins and May, EMNLP 2011. Tuning as Ranking
Citation
Mark Hopkins and Jonathan May. 2011. Tuning as Ranking. In Proceedings of EMNLP-2011.
Online Version
Summary
This paper presents a simple and scalable method for statistical machine translation parameter tuning based on the pairwise approach to ranking. This pairwise ranking optimization (PRO) method has advantages over MERT (Och, 2003), as it is not limited to a handful of parameters and can easily handle systems with thousands of features. In addition, unlike recent approaches built on the MIRA algorithm of Crammer and Singer (2003), such as Watanabe et al. (2007), PRO is easy to implement.
Method
Although MERT is well-understood, easy to implement, and fast to run, it can behave erratically and does not scale beyond a handful of features. This is a major bottleneck for working with richer feature representations and structure.
The authors therefore propose a tuning approach that is simpler than MIRA yet similarly scales to high-dimensional feature spaces. Tuning is treated as a ranking problem (Chen et al., 2009), where the explicit goal is to learn to rank candidate translations correctly. The authors adopt a pairwise approach to ranking, in which the ranking problem is reduced to the binary classification task of deciding, for a pair of candidate translations, which one is better.
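To make the pairwise reduction concrete, below is a minimal sketch in Python (not from the paper). It assumes each candidate translation in a k-best list comes with a NumPy feature vector and a sentence-level gold score such as smoothed BLEU, and it uses scikit-learn's logistic regression as the off-the-shelf binary classifier. The sampling scheme (draw pairs at random, discard near-ties, keep the pairs with the largest gold-score difference) follows the paper's description, but the helper names and the default hyperparameter values here are illustrative assumptions.

```python
import random
import numpy as np
from sklearn.linear_model import LogisticRegression

def pro_sample_pairs(candidates, num_samples=5000, keep=50, min_diff=0.05):
    """Sample candidate pairs from one k-best list: draw pairs at random,
    discard pairs whose gold-score difference is below min_diff, and keep
    the `keep` pairs with the largest difference.
    `candidates` is a list of (feature_vector, gold_score) tuples.
    (Default values are illustrative, not prescribed here.)"""
    pairs = []
    if len(candidates) < 2:
        return pairs
    for _ in range(num_samples):
        (h1, g1), (h2, g2) = random.sample(candidates, 2)
        if abs(g1 - g2) >= min_diff:
            pairs.append((abs(g1 - g2), h1, g1, h2, g2))
    pairs.sort(key=lambda p: p[0], reverse=True)
    return pairs[:keep]

def pro_training_data(kbest_lists):
    """Reduce ranking to binary classification: each kept pair yields two
    mirrored instances whose input is the difference of the two feature
    vectors and whose label indicates which candidate scores higher."""
    X, y = [], []
    for candidates in kbest_lists:
        for _, h1, g1, h2, g2 in pro_sample_pairs(candidates):
            X.append(h1 - h2); y.append(1 if g1 > g2 else 0)
            X.append(h2 - h1); y.append(0 if g1 > g2 else 1)
    return np.array(X), np.array(y)

def pro_tune(kbest_lists):
    """One PRO tuning step: any off-the-shelf linear binary classifier can
    play the inner role; its weight vector becomes the new decoder weights."""
    X, y = pro_training_data(kbest_lists)
    clf = LogisticRegression(fit_intercept=False).fit(X, y)
    return clf.coef_[0]
```

In a full tuning loop, `pro_tune` would be called once per iteration over the k-best lists produced by the decoder under the current weights, and the returned vector would replace (or be interpolated with) the current weights before re-decoding. Because the classifier is linear in the feature differences, the learned weights rank candidates by dot product, which is exactly how the decoder scores them.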