Automatic Detection of Translation Direction

Ilia Sominsky, Shuly Wintner


Abstract
Parallel corpora are crucial resources for NLP applications, most notably for machine translation. The direction of the (human) translation of parallel corpora has been shown to have significant implications for the quality of statistical machine translation systems that are trained with such corpora. We describe a method for determining the direction of the (manual) translation of parallel corpora at the sentence-pair level. Using several linguistically-motivated features, coupled with a neural network model, we obtain high accuracy on several language pairs. Furthermore, we demonstrate that the accuracy is correlated with the (typological) distance between the two languages.
Anthology ID:
R19-1130
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1131–1140
URL:
https://aclanthology.org/R19-1130
DOI:
10.26615/978-954-452-056-4_130
Cite (ACL):
Ilia Sominsky and Shuly Wintner. 2019. Automatic Detection of Translation Direction. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1131–1140, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Automatic Detection of Translation Direction (Sominsky & Wintner, RANLP 2019)
PDF:
https://aclanthology.org/R19-1130.pdf