Martin Ratajczak
2010
A Comparison of Various Types of Extended Lexicon Models for Statistical Machine Translation
Matthias Huck | Martin Ratajczak | Patrick Lehnen | Hermann Ney
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
In this work we give a detailed comparison of the impact of the integration of discriminative and trigger-based lexicon models in state-of-the-art hierarchical and conventional phrase-based statistical machine translation systems. As both types of extended lexicon models can grow very large, we apply certain restrictions to discard some of the less useful information. We show how these restrictions facilitate the training of the extended lexicon models. We finally evaluate systems that incorporate both types of models with different restrictions on a large-scale translation task for the Arabic-English language pair. Our results suggest that extended lexicon models can be substantially reduced in size while still giving clear improvements in translation performance.