That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah Smith


Abstract
The translation of ambiguous text presents a challenge for translation systems, as it requires using the surrounding context to disambiguate the intended meaning as much as possible. While prior work has studied ambiguities that result from different grammatical features of the source and target language, we study semantic ambiguities that exist in the source (English in this work) itself. In particular, we focus on idioms that are open to both literal and figurative interpretations (e.g., goose egg), and collect TIDE, a dataset of 512 pairs of English sentences containing idioms with disambiguating context such that one is literal (it laid a goose egg) and another is figurative (they scored a goose egg, as in a score of zero). In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences. We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation. On the other hand, LMs are far more context-aware, although there remain disparities across target languages. Our findings underline the potential of LMs as a strong backbone for context-aware translation.
Anthology ID:
2023.findings-emnlp.302
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4555–4569
URL:
https://aclanthology.org/2023.findings-emnlp.302
DOI:
10.18653/v1/2023.findings-emnlp.302
Cite (ACL):
Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, and Noah Smith. 2023. That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 4555–4569, Singapore. Association for Computational Linguistics.
Cite (Informal):
That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? (Lee et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.302.pdf