It’s not a Non-Issue: Negation as a Source of Error in Machine Translation

Md Mosharaf Hossain, Antonios Anastasopoulos, Eduardo Blanco, Alexis Palmer


Abstract
As machine translation (MT) systems progress at a rapid pace, questions of their adequacy linger. In this study we focus on negation, a universal, core property of human language that significantly affects the semantics of an utterance. We investigate whether translating negation is an issue for modern MT systems, using 17 translation directions as a test bed. Through a thorough analysis, we find that the presence of negation can indeed significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%. We also provide a linguistically motivated analysis that directly explains the majority of our findings. We release our annotations and code to replicate our analysis here: https://github.com/mosharafhossain/negation-mt.
Anthology ID:
2020.findings-emnlp.345
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3869–3885
URL:
https://aclanthology.org/2020.findings-emnlp.345
DOI:
10.18653/v1/2020.findings-emnlp.345
Cite (ACL):
Md Mosharaf Hossain, Antonios Anastasopoulos, Eduardo Blanco, and Alexis Palmer. 2020. It’s not a Non-Issue: Negation as a Source of Error in Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3869–3885, Online. Association for Computational Linguistics.
Cite (Informal):
It’s not a Non-Issue: Negation as a Source of Error in Machine Translation (Hossain et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.345.pdf
Code:
mosharafhossain/negation-mt