Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages

Carlos Mullov, Quan Pham, Alexander Waibel


Abstract
Multilingual neural machine translation systems learn to map sentences of different languages into a common representation space. Intuitively, as the number of seen languages grows, the encoder's sentence representation becomes more flexible and more easily adaptable to new languages. In this work, we test this hypothesis by zero-shot translating from unseen languages. To deal with unknown vocabularies from unknown languages, we propose a setup where we decouple the learning of vocabulary and syntax, i.e., for each language we learn word representations in a separate step (using cross-lingual word embeddings), and then train to translate while keeping those word representations frozen. We demonstrate that this setup enables zero-shot translation from entirely unseen languages. Zero-shot translating with a model trained on Germanic and Romance languages, we achieve scores of 42.6 BLEU for Portuguese-English and 20.7 BLEU for Russian-English on the TED domain. We explore how this zero-shot translation capability develops with a varying number of languages seen by the encoder. Lastly, we explore the effectiveness of our decoupled learning strategy for unsupervised machine translation. By exploiting our model’s zero-shot translation capability for iterative back-translation, we attain near parity with a supervised setting.
Anthology ID:
2024.acl-long.362
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
6693–6709
URL:
https://aclanthology.org/2024.acl-long.362/
DOI:
10.18653/v1/2024.acl-long.362
Cite (ACL):
Carlos Mullov, Quan Pham, and Alexander Waibel. 2024. Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6693–6709, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages (Mullov et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.362.pdf