BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Zheng Xin Yong; Hailey Schoelkopf; Niklas Muennighoff; Alham Fikri Aji; David Ifeoluwa Adelani; Khalid Almubarak; M Saiful Bari; Lintang Sutawika; Jungo Kasai; Ahmed Baruwa; Genta Indra Winata; Stella Biderman; Edward Raff; Dragomir Radev; Vassilina Nikoulina

doi:10.18653/v1/2023.acl-long.653

Abstract

The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages. To extend the benefits of BLOOM to other languages without incurring prohibitively large costs, it is desirable to adapt BLOOM to new languages not seen during pretraining. In this work, we apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages in a resource-constrained setting. We find language adaptation to be effective at improving zero-shot performance in new languages. Surprisingly, we find that adapter-based finetuning is more effective than continued pretraining for large models. In addition, we discover that prompting performance is not significantly affected by language specifics, such as the writing system; it is primarily determined by the size of the language adaptation data. We also add new languages to BLOOMZ, a multitask finetuned version of BLOOM capable of following task instructions zero-shot. We find that including a new language in the multitask finetuning mixture is the most effective method to teach BLOOMZ a new language. We conclude that, with sufficient training data, language adaptation can generalize well to diverse languages. Our code is available at https://github.com/bigscience-workshop/multilingual-modeling.

Anthology ID: 2023.acl-long.653
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 11682–11703
URL: https://aclanthology.org/2023.acl-long.653/
DOI: 10.18653/v1/2023.acl-long.653
Cite (ACL): Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Winata, Stella Biderman, Edward Raff, Dragomir Radev, and Vassilina Nikoulina. 2023. BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11682–11703, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting (Yong et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.653.pdf
Video: https://aclanthology.org/2023.acl-long.653.mp4
