@inproceedings{ustun-etal-2022-hyper,
title = "Hyper-{X}: A Unified Hypernetwork for Multi-Task Multilingual Transfer",
author = {{\"U}st{\"u}n, Ahmet and
Bisazza, Arianna and
Bouma, Gosse and
van Noord, Gertjan and
Ruder, Sebastian},
editor = "Goldberg, Yoav and
Kozareva, Zornitsa and
Zhang, Yue",
booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.emnlp-main.541",
doi = "10.18653/v1/2022.emnlp-main.541",
pages = "7934--7949",
abstract = "Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. It generates weights for adapter modules conditioned on both tasks and language embeddings. By learning to combine task and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gain when a mixture of multiple resources is available, while on par with strong baseline in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ustun-etal-2022-hyper">
<titleInfo>
<title>Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ahmet</namePart>
<namePart type="family">Üstün</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Arianna</namePart>
<namePart type="family">Bisazza</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gosse</namePart>
<namePart type="family">Bouma</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Gertjan</namePart>
<namePart type="family">van Noord</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sebastian</namePart>
<namePart type="family">Ruder</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2022-12</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yoav</namePart>
<namePart type="family">Goldberg</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Zornitsa</namePart>
<namePart type="family">Kozareva</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yue</namePart>
<namePart type="family">Zhang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Abu Dhabi, United Arab Emirates</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. It generates weights for adapter modules conditioned on both task and language embeddings. By learning to combine task- and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gains when a mixture of multiple resources is available, while remaining on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.</abstract>
<identifier type="citekey">ustun-etal-2022-hyper</identifier>
<identifier type="doi">10.18653/v1/2022.emnlp-main.541</identifier>
<location>
<url>https://aclanthology.org/2022.emnlp-main.541</url>
</location>
<part>
<date>2022-12</date>
<extent unit="page">
<start>7934</start>
<end>7949</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer
%A Üstün, Ahmet
%A Bisazza, Arianna
%A Bouma, Gosse
%A van Noord, Gertjan
%A Ruder, Sebastian
%Y Goldberg, Yoav
%Y Kozareva, Zornitsa
%Y Zhang, Yue
%S Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
%D 2022
%8 December
%I Association for Computational Linguistics
%C Abu Dhabi, United Arab Emirates
%F ustun-etal-2022-hyper
%X Massively multilingual models are promising for transfer learning across tasks and languages. However, existing methods are unable to fully leverage training data when it is available in different task-language combinations. To exploit such heterogeneous supervision, we propose Hyper-X, a single hypernetwork that unifies multi-task and multilingual learning with efficient adaptation. It generates weights for adapter modules conditioned on both task and language embeddings. By learning to combine task- and language-specific knowledge, our model enables zero-shot transfer for unseen languages and task-language combinations. Our experiments on a diverse set of languages demonstrate that Hyper-X achieves the best or competitive gains when a mixture of multiple resources is available, while remaining on par with strong baselines in the standard scenario. Hyper-X is also considerably more efficient in terms of parameters and resources compared to methods that train separate adapters. Finally, Hyper-X consistently produces strong results in few-shot scenarios for new languages, showing the versatility of our approach beyond zero-shot transfer.
%R 10.18653/v1/2022.emnlp-main.541
%U https://aclanthology.org/2022.emnlp-main.541
%U https://doi.org/10.18653/v1/2022.emnlp-main.541
%P 7934-7949
[Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer](https://aclanthology.org/2022.emnlp-main.541) (Üstün et al., EMNLP 2022)
Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, and Sebastian Ruder. 2022. Hyper-X: A Unified Hypernetwork for Multi-Task Multilingual Transfer. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7934–7949, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
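
To make the mechanism described in the abstract concrete, here is a minimal PyTorch sketch of the core idea: a single hypernetwork that maps learned task and language embeddings to the weights of a bottleneck adapter. All class names, dimensions, and the adapter layout are illustrative assumptions, not the authors' implementation; the paper's model also conditions on layer position, which is omitted here for brevity.

```python
# Minimal sketch (not the authors' code): a hypernetwork that generates
# bottleneck-adapter weights conditioned on task and language embeddings.
import torch
import torch.nn as nn


class HyperAdapter(nn.Module):
    def __init__(self, n_tasks, n_langs, src_dim=64, hidden=512,
                 d_model=768, bottleneck=64):
        super().__init__()
        self.d_model, self.bottleneck = d_model, bottleneck
        # Learned source embeddings for each task and each language.
        self.task_emb = nn.Embedding(n_tasks, src_dim)
        self.lang_emb = nn.Embedding(n_langs, src_dim)
        # Hypernetwork: maps the (task, language) pair to adapter parameters.
        self.hyper = nn.Sequential(nn.Linear(2 * src_dim, hidden), nn.ReLU())
        n_params = 2 * d_model * bottleneck  # down- and up-projection matrices
        self.to_weights = nn.Linear(hidden, n_params)

    def forward(self, x, task_id, lang_id):
        # x: (batch, seq_len, d_model) hidden states from a frozen transformer layer.
        src = torch.cat([self.task_emb(task_id), self.lang_emb(lang_id)], dim=-1)
        w = self.to_weights(self.hyper(src))
        w_down, w_up = w.split(self.d_model * self.bottleneck, dim=-1)
        w_down = w_down.view(self.d_model, self.bottleneck)
        w_up = w_up.view(self.bottleneck, self.d_model)
        # Standard bottleneck adapter with a residual connection.
        return x + torch.relu(x @ w_down) @ w_up


# Usage: one shared hypernetwork serves every task-language combination,
# including pairs never seen together during training.
adapter = HyperAdapter(n_tasks=2, n_langs=16)
h = torch.randn(4, 10, 768)
out = adapter(h, task_id=torch.tensor(0), lang_id=torch.tensor(3))
print(out.shape)  # torch.Size([4, 10, 768])
```

Because the adapter weights are a function of the (task, language) pair rather than stored separately per pair, the same network can produce weights for combinations never observed together during training, which is what enables the zero-shot task-language transfer the abstract describes.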