The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.

Andreas Zollmann, Ashish Venugopal, Stephan Vogel


Abstract
We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach to integrating evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality from using N-best alignments in both hierarchical and syntax-augmented translation systems.
Anthology ID:
2008.iwslt-evaluation.2
Volume:
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
October 20-21
Year:
2008
Address:
Waikiki, Hawaii
Venue:
IWSLT
SIG:
SIGSLT
Pages:
18–25
URL:
https://aclanthology.org/2008.iwslt-evaluation.2
Cite (ACL):
Andreas Zollmann, Ashish Venugopal, and Stephan Vogel. 2008. The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments. In Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 18–25, Waikiki, Hawaii.
Cite (Informal):
The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments. (Zollmann et al., IWSLT 2008)
PDF:
https://aclanthology.org/2008.iwslt-evaluation.2.pdf