Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources

Randy Scansani, Silvia Bernardini, Adriano Ferraresi, Federico Gaspari, Marcello Soffritti


Abstract
This paper describes an approach to translating course unit descriptions from Italian and German into English, using a phrase-based machine translation (MT) system. The genre is very prominent among those requiring translation by universities in European countries in which English is a non-native language. For each language combination, an in-domain bilingual corpus including course unit and degree program descriptions is used to train an MT engine, whose output is then compared to that of a baseline engine trained on the Europarl corpus. In a subsequent experiment, a bilingual terminology database is added to the training sets of both engines and its impact on the output quality is evaluated based on BLEU and post-editing score. Results suggest that the use of domain-specific corpora boosts the engines' quality for both language combinations, especially for German-English, whereas adding terminological resources does not seem to bring notable benefits.
Anthology ID:
W17-7901
Volume:
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Editors:
Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
Venue:
RANLP
Publisher:
Association for Computational Linguistics, Shoumen, Bulgaria
Pages:
1–10
URL:
https://doi.org/10.26615/978-954-452-042-7_001
DOI:
10.26615/978-954-452-042-7_001
Cite (ACL):
Randy Scansani, Silvia Bernardini, Adriano Ferraresi, Federico Gaspari, and Marcello Soffritti. 2017. Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources. In Proceedings of the Workshop Human-Informed Translation and Interpreting Technology, pages 1–10, Varna, Bulgaria. Association for Computational Linguistics, Shoumen, Bulgaria.
Cite (Informal):
Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources (Scansani et al., RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-042-7_001