Contextual Selection of Pseudo-terminology Constraints for Terminology-aware Neural Machine Translation in the IT Domain

Benjamin Pong

doi:10.18653/v1/2025.wmt-1.109


Abstract

This system paper describes the development of a Neural Machine Translation (NMT) system that is adapted to the Information Technology (IT) domain and can translate specialized IT terminology. Incorporating terminology constraints at training time is a popular way to build terminology-aware NMT engines, but it raises a central question: in the absence of terminology references for training, and given the large number of candidate source-target alignments, how should word alignments be selected as pseudo-terminology constraints? The system in this work uses the encoder’s final hidden states as proxies for terminology and selects the word alignments with the highest norms as pseudo-terminology constraints for inline annotation at run-time. This context-based approach is compared against a conventional statistical approach, in which terminology constraints are selected using a low-frequency threshold. The systems were evaluated on general translation quality and Terminology Success Rate, and the results validate the effectiveness of the contextual approach.
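
The two selection strategies contrasted in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of the L2 norm, the top-k cutoff, the count threshold, and the toy vectors and counts below are all assumptions made for illustration. Alignments are represented as (source index, target index) pairs.

```python
import math
from collections import Counter


def l2_norm(vec):
    """Euclidean norm of a hidden-state vector (assumed norm; the abstract does not name one)."""
    return math.sqrt(sum(x * x for x in vec))


def select_by_norm(alignments, hidden_states, k=1):
    """Contextual selection: keep the k alignments whose source token
    has the largest encoder hidden-state norm (k is a hypothetical parameter)."""
    ranked = sorted(
        alignments,
        key=lambda pair: l2_norm(hidden_states[pair[0]]),
        reverse=True,
    )
    # Return the survivors in source order for readability.
    return sorted(ranked[:k])


def select_by_frequency(alignments, source_tokens, counts, max_count=1):
    """Statistical baseline: keep alignments whose source word occurs
    at most max_count times in the corpus (hypothetical threshold)."""
    return [(s, t) for s, t in alignments if counts[source_tokens[s]] <= max_count]


# Invented toy data, not taken from the paper.
source_tokens = ["restart", "the", "daemon"]
alignments = [(0, 0), (1, 1), (2, 2)]
hidden_states = [[0.5, 0.5], [0.1, 0.2], [3.0, 4.0]]  # one vector per source token
corpus_counts = Counter({"restart": 40, "the": 5000, "daemon": 1})

norm_choice = select_by_norm(alignments, hidden_states, k=1)
freq_choice = select_by_frequency(alignments, source_tokens, corpus_counts, max_count=1)
print(norm_choice, freq_choice)
```

In this toy case both strategies pick the alignment for "daemon". In general they can differ, since the norm-based choice depends on the sentence context seen by the encoder, whereas the frequency-based choice depends only on corpus counts.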

Anthology ID: 2025.wmt-1.109
Volume: Proceedings of the Tenth Conference on Machine Translation
Month: November
Year: 2025
Address: Suzhou, China
Editors: Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue: WMT
Publisher: Association for Computational Linguistics
Pages: 1292–1301
URL: https://aclanthology.org/2025.wmt-1.109/
DOI: 10.18653/v1/2025.wmt-1.109
Cite (ACL): Benjamin Pong. 2025. Contextual Selection of Pseudo-terminology Constraints for Terminology-aware Neural Machine Translation in the IT Domain. In Proceedings of the Tenth Conference on Machine Translation, pages 1292–1301, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Contextual Selection of Pseudo-terminology Constraints for Terminology-aware Neural Machine Translation in the IT Domain (Pong, WMT 2025)
PDF: https://aclanthology.org/2025.wmt-1.109.pdf
