The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2008.

Coşkun Mermer, Hamza Kaya, Ömer Farukhan Güneş, Mehmet Uğur Doğan


Abstract
We present the TÜBİTAK-UEKAE statistical machine translation system that participated in the IWSLT 2008 evaluation campaign. Our system is based on the open-source phrase-based statistical machine translation software Moses. Additionally, phrase-table augmentation is applied to maximize source language coverage; lexical approximation is applied to replace out-of-vocabulary words with known words prior to decoding; and automatic punctuation insertion is improved. We describe the preprocessing and postprocessing steps and our training and decoding procedures. Results are presented on our participation in the classical Arabic-English and Chinese-English tasks as well as the new Chinese-Spanish direct and Chinese-English-Spanish pivot translation tasks.
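As an illustration of the lexical approximation idea mentioned in the abstract (replacing out-of-vocabulary words with known words before decoding), a minimal sketch follows. It is not the authors' procedure: the abstract does not say how replacement candidates are chosen, so the character-level similarity from Python's `difflib`, the function name `approximate_oov`, and the `cutoff` threshold are all assumptions made for this example.

```python
import difflib


def approximate_oov(tokens, vocabulary, cutoff=0.8):
    """Replace each out-of-vocabulary token with its closest known word.

    Assumption: closeness is measured by difflib's string similarity ratio.
    Tokens with no candidate at or above `cutoff` are left unchanged.
    """
    known = sorted(vocabulary)  # fixed candidate order for reproducibility
    result = []
    for token in tokens:
        if token in vocabulary:
            result.append(token)  # known word: keep as-is
            continue
        matches = difflib.get_close_matches(token, known, n=1, cutoff=cutoff)
        result.append(matches[0] if matches else token)
    return result


# Hypothetical vocabulary, standing in for the words covered by a phrase table.
vocab = {"the", "hotel", "is", "near", "station"}
print(approximate_oov("the hotels is near xyz".split(), vocab))
```

In this toy run "hotels" would be mapped to the known word "hotel", while "xyz" has no sufficiently similar candidate and is passed through unchanged.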
Anthology ID:
2008.iwslt-evaluation.20
Volume:
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
October 20-21
Year:
2008
Address:
Waikiki, Hawaii
Venue:
IWSLT
SIG:
SIGSLT
Pages:
138–142
URL:
https://aclanthology.org/2008.iwslt-evaluation.20
Cite (ACL):
Coşkun Mermer, Hamza Kaya, Ömer Farukhan Güneş, and Mehmet Uğur Doğan. 2008. The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2008. In Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 138–142, Waikiki, Hawaii.
Cite (Informal):
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2008. (Mermer et al., IWSLT 2008)
PDF:
https://aclanthology.org/2008.iwslt-evaluation.20.pdf