SimQA: Detecting Simultaneous MT Errors through Word-by-Word Question Answering

HyoJung Han, Marine Carpuat, Jordan Boyd-Graber


Abstract
Detractors of neural machine translation admit that while its translations are fluent, it sometimes gets key facts wrong. This is particularly important in simultaneous interpretation, where translations have to be provided as fast as possible: before a sentence is complete. Yet evaluations of simultaneous machine translation (SimulMT) fail to capture whether systems correctly translate the most salient elements of a question: people, places, and dates. To address this problem, we introduce a downstream word-by-word question answering evaluation task (SimQA): given a source language question, translate the question word by word into the target language, and answer as soon as possible. SimQA jointly measures whether the SimulMT models translate the question quickly and accurately, and can reveal shortcomings in existing neural systems: hallucinating or omitting facts.
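The task loop described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the function answer_incrementally, the confidence threshold, and the toy translator and answerer are all hypothetical names introduced here for illustration. The sketch assumes a translator that maps a source prefix to a target prefix and an answerer that returns a guess with a confidence score.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# a translator maps a source-language prefix to a target-language prefix,
# an answerer maps a target-language prefix to (guess, confidence).
Translator = Callable[[List[str]], List[str]]
Answerer = Callable[[List[str]], Tuple[str, float]]


def answer_incrementally(
    source_tokens: List[str],
    translate_prefix: Translator,
    answer_prefix: Answerer,
    threshold: float = 0.5,
) -> Tuple[Optional[str], int]:
    """Read the question one source word at a time and answer as early as possible.

    Returns (answer, number_of_source_words_read). The answer is None if the
    answerer never reaches the confidence threshold.
    """
    for n_read in range(1, len(source_tokens) + 1):
        target_prefix = translate_prefix(source_tokens[:n_read])
        guess, confidence = answer_prefix(target_prefix)
        if confidence >= threshold:
            return guess, n_read
    return None, len(source_tokens)


# Toy stand-ins so the sketch runs end to end (purely illustrative).
LEXICON = {"quién": "who", "escribió": "wrote", "hamlet": "hamlet"}


def toy_translate(prefix: List[str]) -> List[str]:
    # Word-by-word dictionary lookup; unknown words pass through unchanged.
    return [LEXICON.get(word, word) for word in prefix]


def lossy_translate(prefix: List[str]) -> List[str]:
    # Same translator, but it omits the key entity, mimicking an omission error.
    return [word for word in toy_translate(prefix) if word != "hamlet"]


def toy_answer(target_prefix: List[str]) -> Tuple[str, float]:
    # Confident only once both the relation and the entity have been seen.
    if "wrote" in target_prefix and "hamlet" in target_prefix:
        return "Shakespeare", 0.9
    return "", 0.0


question = ["quién", "escribió", "hamlet", "en", "inglaterra"]
good_result = answer_incrementally(question, toy_translate, toy_answer)
lossy_result = answer_incrementally(question, lossy_translate, toy_answer)
```

In this toy setup, the faithful translator lets the answerer respond after three of the five source words, while the translator that omits the entity never yields an answer. That is the kind of omitted-fact error the abstract says the downstream task is meant to expose.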
Anthology ID: 2022.emnlp-main.378
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5598–5616
URL: https://aclanthology.org/2022.emnlp-main.378
DOI: 10.18653/v1/2022.emnlp-main.378
Cite (ACL): HyoJung Han, Marine Carpuat, and Jordan Boyd-Graber. 2022. SimQA: Detecting Simultaneous MT Errors through Word-by-Word Question Answering. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 5598–5616, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): SimQA: Detecting Simultaneous MT Errors through Word-by-Word Question Answering (Han et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.378.pdf