Understanding and Improving the Robustness of Terminology Constraints in Neural Machine Translation

Huaao Zhang; Qiang Wang; Bo Qin; Zelin Shi; Haibo Wang; Ming Chen

doi:10.18653/v1/2023.acl-long.332

Abstract

In this work, we study the robustness of two typical terminology translation methods, Placeholder (PH) and Code-Switch (CS), with respect to (1) the number of constraints and (2) the target constraint length. We find that existing terminology constraint test sets, such as IATE, Wiktionary, and TICO, are blind to this issue because their constraint settings are oversimplified. To address this, we create a new, challenging English-German test set that increases the average number of constraints per sentence from 1.1-1.7 to 6.1 and the average length of a target constraint from 1.1-1.2 words to 3.4 words. We then find that both PH and CS degrade as the number of constraints increases, but that they have complementary strengths. Specifically, PH retains higher constraint accuracy but yields lower translation quality as measured by BLEU and COMET scores, whereas CS shows the opposite pattern. Based on these observations, we propose a simple but effective method that combines the advantages of PH and CS. The approach trains a model, as in PH, to predict term labels; during inference, it replaces those labels with the target terminology text, as in CS, so that subsequent generation is aware of the target term content. Extensive experimental results show that this approach achieves high constraint accuracy and translation quality simultaneously, regardless of the number or length of constraints.
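The abstract's contrast between PH, CS, and the combined approach can be illustrated with a toy sketch. This is not the authors' implementation: the label format `<term0>`, the function names, and the `step_fn` callback standing in for one decoder step are all hypothetical, and the sketch assumes whitespace-tokenized text with constraints given as (source term, target term) pairs.

```python
import re
from typing import Callable, List, Tuple

Constraint = Tuple[str, str]  # (source term, target term); assumed format

# Hypothetical label format; the paper's actual labels may differ.
LABEL = re.compile(r"<term(\d+)>")


def apply_placeholder(source: str, constraints: List[Constraint]) -> str:
    """PH-style input: replace each source term with an indexed label."""
    for i, (src, _tgt) in enumerate(constraints):
        source = source.replace(src, f"<term{i}>", 1)
    return source


def apply_code_switch(source: str, constraints: List[Constraint]) -> str:
    """CS-style input: replace each source term with its target term."""
    for src, tgt in constraints:
        source = source.replace(src, tgt, 1)
    return source


def decode_with_label_substitution(
    step_fn: Callable[[List[str]], str],
    constraints: List[Constraint],
    max_steps: int = 50,
) -> List[str]:
    """Greedy loop for the combined idea described in the abstract.

    step_fn is a stand-in for one decoder step: given the target prefix,
    it returns the next token. When that token is a term label, the
    target term's words are appended to the prefix instead of the label,
    so later steps condition on the actual term text.
    """
    prefix: List[str] = []
    for _ in range(max_steps):
        token = step_fn(prefix)
        if token == "</s>":
            break
        match = LABEL.fullmatch(token)
        if match:
            target_term = constraints[int(match.group(1))][1]
            prefix.extend(target_term.split())
        else:
            prefix.append(token)
    return prefix
```

For example, with the constraints `[("heart attack", "Herzinfarkt"), ("blood pressure", "Blutdruck")]`, `apply_placeholder` turns "The heart attack raised blood pressure ." into a sentence containing `<term0>` and `<term1>`, while `apply_code_switch` inserts the German terms directly. A real system would replace `step_fn` with a trained model's next-token prediction.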

Anthology ID: 2023.acl-long.332
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6029–6042
URL: https://aclanthology.org/2023.acl-long.332
DOI: 10.18653/v1/2023.acl-long.332
Cite (ACL): Huaao Zhang, Qiang Wang, Bo Qin, Zelin Shi, Haibo Wang, and Ming Chen. 2023. Understanding and Improving the Robustness of Terminology Constraints in Neural Machine Translation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6029–6042, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Understanding and Improving the Robustness of Terminology Constraints in Neural Machine Translation (Zhang et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.332.pdf
Video: https://aclanthology.org/2023.acl-long.332.mp4
