Bilingual Synchronization: Restoring Translational Relationships with Editing Operations

Jitao Xu, Josep Crego, François Yvon


Abstract
Machine Translation (MT) is usually viewed as a one-shot process that generates the target language equivalent of some source text from scratch. We consider here a more general setting which assumes an initial target sequence that must be transformed into a valid translation of the source, thereby restoring parallelism between source and target. For this bilingual synchronization task, we consider several architectures (both autoregressive and non-autoregressive) and training regimes, and experiment with multiple practical settings such as simulated interactive MT, translating with Translation Memory (TM) and TM cleaning. Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
Anthology ID:
2022.emnlp-main.548
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8016–8030
URL:
https://aclanthology.org/2022.emnlp-main.548
DOI:
10.18653/v1/2022.emnlp-main.548
Cite (ACL):
Jitao Xu, Josep Crego, and François Yvon. 2022. Bilingual Synchronization: Restoring Translational Relationships with Editing Operations. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 8016–8030, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Bilingual Synchronization: Restoring Translational Relationships with Editing Operations (Xu et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.548.pdf