Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models

Oskar Holmström, Ehsan Doostmohammadi


Abstract
In recent years, instruction-finetuned models have received increased attention due to their remarkable zero-shot and generalization capabilities. However, the widespread implementation of these models has been limited to the English language, largely due to the costs and challenges associated with creating instruction datasets. To overcome this, automatic instruction generation has been proposed as a resourceful alternative. We see this as an opportunity for the adoption of instruction finetuning for other languages. In this paper, we explore the viability of instruction finetuning for Swedish. We translate a dataset of generated instructions from English to Swedish and use it to finetune both Swedish and non-Swedish models. Results indicate that the use of translated instructions significantly improves the models’ zero-shot performance, even on unseen data, while staying competitive with strong baselines ten times their size. We see this paper as a first step and a proof of concept that instruction finetuning for Swedish is within reach, through resourceful means, and that there exist several directions for further improvements.
Anthology ID:
2023.nodalida-1.62
Volume:
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May
Year:
2023
Address:
Tórshavn, Faroe Islands
Editors:
Tanel Alumäe, Mark Fishel
Venue:
NoDaLiDa
Publisher:
University of Tartu Library
Pages:
634–642
URL:
https://aclanthology.org/2023.nodalida-1.62
Cite (ACL):
Oskar Holmström and Ehsan Doostmohammadi. 2023. Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 634–642, Tórshavn, Faroe Islands. University of Tartu Library.
Cite (Informal):
Making Instruction Finetuning Accessible to Non-English Languages: A Case Study on Swedish Models (Holmström & Doostmohammadi, NoDaLiDa 2023)
PDF:
https://aclanthology.org/2023.nodalida-1.62.pdf