A Taxonomy and Study of Critical Errors in Machine Translation

Khetam Al Sharou, Lucia Specia


Abstract
Not all machine mistranslations are equal. For example, mistranslating a date or time in an appointment, mistranslating a number or currency in a contract, or hallucinating profanity may have consequences for users even when MT is used only for gisting. The severity of errors is an important, but overlooked, aspect of MT quality evaluation. In this paper, we present the results of our effort to bring awareness to the problem of critical translation errors. We study, validate and improve an initial taxonomy of critical errors with a view to providing guidance for critical error analysis, annotation and mitigation. We test the taxonomy on three different languages to examine to what extent it generalises across languages. We provide an account of factors that affect annotation tasks, along with recommendations on how to improve the practice in future work. We also study the impact of the source text on generating critical errors in the translation and, based on this, propose a set of recommendations on aspects of MT that need further scrutiny, especially for user-generated content, to avoid generating such errors and hence improve online communication.
Anthology ID:
2022.eamt-1.20
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Editors:
Helena Moniz, Lieve Macken, Andrew Rufener, Loïc Barrault, Marta R. Costa-jussà, Christophe Declercq, Maarit Koponen, Ellie Kemp, Spyridon Pilos, Mikel L. Forcada, Carolina Scarton, Joachim Van den Bogaert, Joke Daems, Arda Tezcan, Bram Vanroy, Margot Fonteyne
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
171–180
URL:
https://aclanthology.org/2022.eamt-1.20
Cite (ACL):
Khetam Al Sharou and Lucia Specia. 2022. A Taxonomy and Study of Critical Errors in Machine Translation. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 171–180, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
A Taxonomy and Study of Critical Errors in Machine Translation (Sharou & Specia, EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.20.pdf