Marcello Soffritti
2017
Enhancing Machine Translation of Academic Course Catalogues with Terminological Resources
Randy Scansani | Silvia Bernardini | Adriano Ferraresi | Federico Gaspari | Marcello Soffritti
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
This paper describes an approach to translating course unit descriptions from Italian and German into English using a phrase-based machine translation (MT) system. This genre is prominent among the texts that universities need to translate in European countries where English is a non-native language. For each language combination, an in-domain bilingual corpus including course unit and degree program descriptions is used to train an MT engine, whose output is then compared to that of a baseline engine trained on the Europarl corpus. In a subsequent experiment, a bilingual terminology database is added to the training sets of both engines, and its impact on output quality is evaluated based on BLEU and post-editing score. Results suggest that the use of domain-specific corpora boosts the engines' quality for both language combinations, especially for German-English, whereas adding terminological resources does not seem to bring notable benefits.