Naime Şeyma Erdem
Translating Between Morphologically Rich Languages: An Arabic-to-Turkish Machine Translation System
İlknur Durgar El-Kahlout | Emre Bektaş | Naime Şeyma Erdem | Hamza Kaya
Proceedings of the Fourth Arabic Natural Language Processing Workshop
This paper introduces our work on building a machine translation system for Arabic-to-Turkish in the news domain. Our work includes collecting parallel datasets in several ways for a new and low-resource language pair, building baseline systems with state-of-the-art architectures, and developing language-specific algorithms for better translation. Parallel datasets are mainly collected in three different ways: i) translating Arabic texts into Turkish by professional translators, ii) exploiting the web for open-source Arabic-Turkish parallel texts, and iii) using back-translation. We performed preliminary experiments for Arabic-to-Turkish machine translation using neural (Marian) machine translation tools together with a novel morphologically motivated vocabulary reduction method.