Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation

Zhi Qu; Chenchen Ding; Taro Watanabe

Abstract

Understanding representation transfer in multilingual neural machine translation (MNMT) can reveal the cause of the zero-shot translation deficiency. In this work, we systematically analyze the representational issues of MNMT models. We first introduce the identity pair, which translates a sentence to itself, to address the lack of a base measure in multilingual investigations, since the identity pair reflects the representation of a language within the model. We then demonstrate that the encoder transfers the source language to the representational subspace of the target language rather than to a language-agnostic state. Thus, the zero-shot translation deficiency arises because the representation of a translation is entangled with other languages and is not transferred to the target language effectively. Based on our findings, we propose two methods: 1) low-rank language-specific embedding at the encoder, and 2) language-specific contrastive learning of the representation at the decoder. Experimental results on the Europarl-15, TED-19, and OPUS-100 datasets show that our methods substantially enhance zero-shot translation performance without sacrificing supervised directions by improving language transfer capacity, thereby providing practical evidence for our conclusions. Code is available at https://github.com/zhiqu22/ZeroTrans.
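The identity pair described in the abstract can be illustrated with a minimal sketch. It only shows the idea of pairing a sentence with itself so that source and target language coincide. The function name `make_identity_pairs`, the tuple layout, and the sample sentences are hypothetical and are not taken from the paper or the ZeroTrans repository.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout (an assumption for illustration):
# (source_lang, target_lang, source_text, target_text)
Example = Tuple[str, str, str, str]


def make_identity_pairs(sentences_by_lang: Dict[str, List[str]]) -> List[Example]:
    """Pair every sentence with itself, so source and target are identical."""
    pairs: List[Example] = []
    for lang, sentences in sentences_by_lang.items():
        for text in sentences:
            # The "translation" of the sentence is the sentence itself.
            pairs.append((lang, lang, text, text))
    return pairs


# Illustrative monolingual samples, not data from the paper.
sample = {"en": ["Good morning."], "de": ["Guten Morgen."]}
identity_pairs = make_identity_pairs(sample)
```

Under this assumed layout, each record asks a model to map a language onto itself, which is what allows an identity pair to serve as a per-language reference point when comparing representations.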

Anthology ID: 2025.mtsummit-1.7
Volume: Proceedings of Machine Translation Summit XX: Volume 1
Month: June
Year: 2025
Address: Geneva, Switzerland
Editors: Pierrette Bouillon, Johanna Gerlach, Sabrina Girletti, Lise Volkart, Raphael Rubino, Rico Sennrich, Ana C. Farinha, Marco Gaido, Joke Daems, Dorothy Kenny, Helena Moniz, Sara Szoc
Venue: MTSummit
Publisher: European Association for Machine Translation
Pages: 81–98
URL: https://aclanthology.org/2025.mtsummit-1.7/
Cite (ACL): Zhi Qu, Chenchen Ding, and Taro Watanabe. 2025. Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation. In Proceedings of Machine Translation Summit XX: Volume 1, pages 81–98, Geneva, Switzerland. European Association for Machine Translation.
Cite (Informal): Languages Transferred Within the Encoder: On Representation Transfer in Zero-Shot Multilingual Translation (Qu et al., MTSummit 2025)
PDF: https://aclanthology.org/2025.mtsummit-1.7.pdf
