Can Domains Be Transferred across Languages in Multi-Domain Multilingual Neural Machine Translation?

Thuy-trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, Gholamreza Haffari


Abstract
Previous work has mostly focused on either the multilingual or the multi-domain aspect of neural machine translation (NMT). This paper investigates whether domain information can be transferred across languages when multi-domain and multilingual NMT are combined, particularly under the incomplete-data condition where in-domain bitext is missing for some language pairs. Our results in curated leave-one-domain-out experiments show that multi-domain multilingual (MDML) NMT can improve zero-shot translation performance by up to 10 BLEU points and aid the generalisation of multi-domain NMT to the missing domain. We also explore strategies for effectively integrating multilingual and multi-domain NMT, including language and domain tag combination and auxiliary task training. We find that learning domain-aware representations and adding target-language tags to the encoder lead to effective MDML-NMT.
Anthology ID:
2022.wmt-1.34
Volume:
Proceedings of the Seventh Conference on Machine Translation (WMT)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Philipp Koehn, Loïc Barrault, Ondřej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Tom Kocmi, André Martins, Makoto Morishita, Christof Monz, Masaaki Nagata, Toshiaki Nakazawa, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Marco Turchi, Marcos Zampieri
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
381–396
URL:
https://aclanthology.org/2022.wmt-1.34
Cite (ACL):
Thuy-trang Vu, Shahram Khadivi, Xuanli He, Dinh Phung, and Gholamreza Haffari. 2022. Can Domains Be Transferred across Languages in Multi-Domain Multilingual Neural Machine Translation? In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 381–396, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Can Domains Be Transferred across Languages in Multi-Domain Multilingual Neural Machine Translation? (Vu et al., WMT 2022)
PDF:
https://aclanthology.org/2022.wmt-1.34.pdf
Video:
https://aclanthology.org/2022.wmt-1.34.mp4