Multilingual Neural Machine Translation With the Right Amount of Sharing

Taido Purason, Andre Tättar


Abstract
Large multilingual Transformer-based machine translation models have played a pivotal role in making translation systems available for hundreds of languages with good zero-shot translation performance. One such example is the universal model with a shared encoder-decoder architecture. Additionally, jointly trained language-specific encoder-decoder systems have been proposed for multilingual neural machine translation (NMT). This work investigates various knowledge-sharing approaches on the encoder side while keeping the decoder language- or language-group-specific. We propose a novel approach that uses universal, language-group-specific, and language-specific modules to address the shortcomings of both universal models and models with language-specific encoder-decoders. Experiments on a multilingual dataset set up to model real-world scenarios, including zero-shot and low-resource translation, show that our proposed models achieve higher translation quality than purely universal and purely language-specific approaches.
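To make the sharing scheme concrete, the following is a minimal sketch (not the authors' code) of how a model might route a sentence pair to universal, language-group-specific, and language-specific parameter sets. The group assignments, module names, and the choice of a fully shared encoder with a group- and language-specific decoder are illustrative assumptions based on the abstract.

```python
# Hypothetical routing of parameter sets in a multilingual NMT model that
# mixes universal, language-group-specific, and language-specific modules.
# The language-to-group mapping below is an illustrative assumption.
LANGUAGE_GROUPS = {
    "et": "finno-ugric", "fi": "finno-ugric",
    "de": "germanic", "en": "germanic",
}

def select_modules(src_lang: str, tgt_lang: str) -> dict:
    """Pick which parameter sets handle a given translation direction."""
    return {
        # One encoder stack shared across all source languages (universal).
        "encoder": "universal",
        # Decoder parameters shared within the target language's group...
        "decoder_group": LANGUAGE_GROUPS[tgt_lang],
        # ...plus a small component specific to the target language itself.
        "decoder_lang": tgt_lang,
    }
```

Under this sketch, translating English to Estonian would use the shared encoder together with the Finno-Ugric group decoder and an Estonian-specific decoder component, so closely related target languages share most decoder parameters while still keeping some capacity per language.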
Anthology ID:
2022.eamt-1.12
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
91–100
URL:
https://aclanthology.org/2022.eamt-1.12
Cite (ACL):
Taido Purason and Andre Tättar. 2022. Multilingual Neural Machine Translation With the Right Amount of Sharing. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 91–100, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
Multilingual Neural Machine Translation With the Right Amount of Sharing (Purason & Tättar, EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.12.pdf