Neuron Specialization: Leveraging Intrinsic Task Modularity for Multilingual Machine Translation

Shaomu Tan, Di Wu, Christof Monz


Abstract
Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on heuristics to distribute capacity and struggle to foster cross-lingual transfer via isolated modules. In this paper, we explore intrinsic task modularity within multilingual networks and leverage these observations to circumvent interference under multilingual translation. We show that neurons in the feed-forward layers tend to be activated in a language-specific manner. Meanwhile, these specialized neurons exhibit structural overlaps that reflect language proximity, which progress across layers. Based on these findings, we propose Neuron Specialization, an approach that identifies specialized neurons to modularize feed-forward layers and then continuously updates them through sparse networks. Extensive experiments show that our approach achieves consistent performance gains over strong baselines with additional analyses demonstrating reduced interference and increased knowledge transfer.
Anthology ID:
2024.emnlp-main.374
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6506–6527
URL:
https://aclanthology.org/2024.emnlp-main.374
Cite (ACL):
Shaomu Tan, Di Wu, and Christof Monz. 2024. Neuron Specialization: Leveraging Intrinsic Task Modularity for Multilingual Machine Translation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 6506–6527, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Neuron Specialization: Leveraging Intrinsic Task Modularity for Multilingual Machine Translation (Tan et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.374.pdf