Difference between revisions of "Comparative Study of Discriminative Models in SMT"

From Cohen Courses

Revision as of 23:28, 5 November 2012

Summary

This page compares and contrasts two discriminative methods for Machine Translation, proposed in An_End-to-End_Discriminative_Approach_to_Machine_Translation and A_Discriminative_Latent_Variable_Model_for_SMT. The main difference between these methods is the approach taken to build the translation model. In the former, a vector of features is trained on parallel data in order to maximize the likelihood of the data, and a weight vector is trained with a weighted perceptron method in a separate phase. The latter work instead employs a log-linear model in which the feature set and the weights are trained jointly in order to maximize the likelihood of the data.

We will call the work in An_End-to-End_Discriminative_Approach_to_Machine_Translation "A", and the work in A_Discriminative_Latent_Variable_Model_for_SMT "B".
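
To make the contrast between the two training regimes concrete, here is a minimal sketch of a single weight update under each one. It is not the implementation from either paper: the candidate list, the two-dimensional feature vectors, the learning rate and the function names are all hypothetical, and the latent derivations that B marginalizes over are left out. The perceptron step moves the weights toward the reference translation and away from the current best guess only; the log-linear step moves them toward the reference and away from the expected feature vector under the model.

```python
import math

def score(w, f):
    # Linear model score: dot product of weights and features.
    return sum(wi * fi for wi, fi in zip(w, f))

def perceptron_update(w, candidates, gold, lr=1.0):
    # Perceptron-style step (A-like, simplified): compare the gold candidate
    # with the single highest-scoring candidate under the current weights.
    pred = max(range(len(candidates)), key=lambda i: score(w, candidates[i]))
    if pred == gold:
        return list(w)
    return [wi + lr * (g - p)
            for wi, g, p in zip(w, candidates[gold], candidates[pred])]

def loglinear_update(w, candidates, gold, lr=1.0):
    # Log-linear step (B-like, simplified): gradient ascent on
    # log p(gold | source), i.e. gold features minus expected features.
    scores = [score(w, f) for f in candidates]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    expected = [sum(p * f[k] for p, f in zip(probs, candidates))
                for k in range(len(w))]
    return [wi + lr * (g - e)
            for wi, g, e in zip(w, candidates[gold], expected)]

# Hypothetical toy data: three candidate translations, two features each.
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gold = 1           # index of the reference translation (assumed)
w0 = [0.0, 0.0]    # initial weights

w_perc = perceptron_update(w0, candidates, gold)
w_ll = loglinear_update(w0, candidates, gold)
print(w_perc, w_ll)
```

In this toy setting the perceptron update depends on one competing candidate, while the log-linear update is influenced by every candidate in proportion to its model probability.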

Minor Differences

  • The two papers use different statistical machine translation models as baselines. The work in A uses the phrase-based models proposed in Koehn_et_al,_ACL_2003, while the work in B uses Hierarchical_phrase-based_translation. In terms of translation quality, hierarchical models tend to work better on language pairs that require substantial reordering, such as Chinese to English.
  • Both use the EUROPARL corpora, but with different language pairs and different training, held-out and test sets. This makes a quantitative comparison of the model results unreliable.