The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007

Coşkun Mermer, Hamza Kaya, Mehmet Uğur Doğan


Abstract
We describe the TÜBİTAK-UEKAE system that participated in the Arabic-to-English and Japanese-to-English translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on Moses, the open-source phrase-based statistical machine translation software. Of the available corpora and linguistic resources, the system uses only the supplied training data and an Arabic morphological analyzer. We present a run-time lexical approximation method to cope with out-of-vocabulary words during decoding. We tested our system under both automatic speech recognition (ASR) and clean transcript (clean) input conditions. Our system was ranked first in both the Arabic-to-English and Japanese-to-English tasks under the “clean” condition.
Anthology ID:
2007.iwslt-1.27
Volume:
Proceedings of the Fourth International Workshop on Spoken Language Translation
Month:
October 15-16
Year:
2007
Address:
Trento, Italy
Venue:
IWSLT
SIG:
SIGSLT
URL:
https://aclanthology.org/2007.iwslt-1.27
Cite (ACL):
Coşkun Mermer, Hamza Kaya, and Mehmet Uğur Doğan. 2007. The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007. In Proceedings of the Fourth International Workshop on Spoken Language Translation, Trento, Italy.
Cite (Informal):
The TÜBİTAK-UEKAE statistical machine translation system for IWSLT 2007 (Mermer et al., IWSLT 2007)
PDF:
https://aclanthology.org/2007.iwslt-1.27.pdf