Efficient Unseen Language Adaptation for Multilingual Pre-Trained Language Models

Po-Heng Chen, Yun-Nung Chen


Abstract
Multilingual pre-trained language models (mPLMs) have demonstrated notable effectiveness in zero-shot cross-lingual transfer: they can be fine-tuned solely on a task in the source language and then applied to the same task in the target language. However, for low-resource languages unseen during pre-training, relying on zero-shot transfer alone often yields sub-optimal results. A common remedy is to continue training the mPLM on target-language text with a masked language modeling objective, but this can be inefficient because all parameters must be updated for language adaptation. In this paper, we propose a more efficient solution: soft-prompt tuning for language adaptation. Our experiments demonstrate that, with carefully designed prompts, soft-prompt tuning enables mPLMs to achieve effective zero-shot cross-lingual transfer to downstream tasks in previously unseen languages. Notably, prompt tuning outperforms continued-pretraining baselines on two text classification benchmarks covering 20 low-resource languages, while tuning a mere 0.28% of the parameters. These results underscore that soft-prompt tuning adapts mPLMs to previously unseen languages more effectively than traditional fine-tuning.
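To make the idea concrete, below is a minimal sketch of soft-prompt tuning for language adaptation with a frozen multilingual backbone: only a small set of prompt vectors is trained on target-language text with a masked language modeling objective. The backbone name (xlm-roberta-base), prompt length, masking rate, and optimizer settings are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_NAME = "xlm-roberta-base"   # assumed mPLM backbone
PROMPT_LEN = 16                   # number of soft-prompt vectors (assumption)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)

# Freeze every backbone parameter; only the soft prompt is trained,
# which is where the small tuned-parameter budget comes from.
for p in model.parameters():
    p.requires_grad = False

hidden_size = model.config.hidden_size
soft_prompt = nn.Parameter(torch.randn(PROMPT_LEN, hidden_size) * 0.02)
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)


def mlm_step(texts):
    """One masked-language-modeling step on target-language text,
    updating only the soft prompt."""
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    input_ids = batch["input_ids"]
    attention_mask = batch["attention_mask"]

    # Randomly mask ~15% of non-special tokens for the MLM objective.
    labels = input_ids.clone()
    special = torch.tensor(
        [tokenizer.get_special_tokens_mask(ids, already_has_special_tokens=True)
         for ids in input_ids.tolist()], dtype=torch.bool)
    mask = (torch.rand(input_ids.shape) < 0.15) & ~special
    labels[~mask] = -100
    input_ids = input_ids.masked_fill(mask, tokenizer.mask_token_id)

    # Prepend the soft prompt in embedding space.
    token_embeds = model.get_input_embeddings()(input_ids)
    bsz = token_embeds.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(bsz, -1, -1)
    inputs_embeds = torch.cat([prompt, token_embeds], dim=1)

    # Extend the attention mask and ignore prompt positions in the loss.
    prompt_mask = torch.ones(bsz, PROMPT_LEN, dtype=attention_mask.dtype)
    attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
    labels = torch.cat([torch.full((bsz, PROMPT_LEN), -100), labels], dim=1)

    loss = model(inputs_embeds=inputs_embeds,
                 attention_mask=attention_mask,
                 labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

In this setup the tuned parameters are just PROMPT_LEN x hidden_size values, a tiny fraction of the backbone, which mirrors the efficiency argument of the abstract; the paper's actual prompt design and training recipe may differ.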
Anthology ID:
2024.emnlp-main.1057
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
18983–18994
URL:
https://aclanthology.org/2024.emnlp-main.1057
Cite (ACL):
Po-Heng Chen and Yun-Nung Chen. 2024. Efficient Unseen Language Adaptation for Multilingual Pre-Trained Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 18983–18994, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Efficient Unseen Language Adaptation for Multilingual Pre-Trained Language Models (Chen & Chen, EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1057.pdf