A Gentle Push Funziona Benissimo: Making Instructed Models in Italian via Contrastive Activation Steering

Daniel Scalena, Elisabetta Fersini, Malvina Nissim


Abstract
Adapting models to a language that was only partially present in the pre-training data requires fine-tuning, which is expensive in terms of both data and computational resources. As an alternative to fine-tuning, we explore the potential of activation steering-based techniques to enhance model performance on Italian tasks. Through our experiments, we show that Italian steering (i) can be successfully applied to different models, (ii) achieves performance comparable to, or even better than, that of models fine-tuned for Italian, and (iii) yields higher quality and consistency in Italian generations. We also discuss the usefulness of steering and fine-tuning in the contemporary LLM landscape, where models achieve high performance on Italian even without being explicitly trained in this language.
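For readers unfamiliar with the technique named in the title, here is a minimal PyTorch sketch of the general idea behind contrastive activation steering, not the authors' exact setup: a steering vector is computed as the difference between mean hidden-state activations on target-language (Italian) vs. source-language (English) inputs at one transformer layer, and is then added to the residual stream during generation. The model choice, layer index, strength ALPHA, and the toy contrastive pairs are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # hypothetical stand-in; the paper evaluates different models
LAYER = 6       # transformer block to steer (assumption; tuned per model)

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def layer_activations(text: str) -> torch.Tensor:
    """Mean residual-stream activation at LAYER for the given text."""
    captured = {}
    def hook(_module, _inp, out):
        # For GPT-2 blocks, `out` is a tuple whose first element is the hidden state.
        captured["h"] = out[0]
    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    with torch.no_grad():
        model(**tok(text, return_tensors="pt"))
    handle.remove()
    return captured["h"].mean(dim=1).squeeze(0)  # average over token positions

# Contrastive pairs: same content in the target vs. source language (toy examples).
italian = ["Ciao, come stai?", "Il gatto dorme sul divano."]
english = ["Hello, how are you?", "The cat sleeps on the sofa."]

# Steering vector = mean(Italian activations) - mean(English activations).
steer = (torch.stack([layer_activations(t) for t in italian]).mean(0)
         - torch.stack([layer_activations(t) for t in english]).mean(0))

ALPHA = 4.0  # steering strength (assumption; needs tuning per model and layer)

def steering_hook(_module, _inp, out):
    # Add the scaled steering vector to the residual stream at every position.
    return (out[0] + ALPHA * steer,) + out[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steering_hook)
ids = tok("Tell me about your day.", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=40)[0]))
handle.remove()
```

This "gentle push" leaves the model weights untouched, which is what makes steering far cheaper than fine-tuning: only a handful of forward passes are needed to build the vector.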
Anthology ID:
2024.clicit-1.98
Volume:
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Month:
December
Year:
2024
Address:
Pisa, Italy
Editors:
Felice Dell'Orletta, Alessandro Lenci, Simonetta Montemagni, Rachele Sprugnoli
Venue:
CLiC-it
Publisher:
CEUR Workshop Proceedings
Pages:
909–920
URL:
https://aclanthology.org/2024.clicit-1.98/
Cite (ACL):
Daniel Scalena, Elisabetta Fersini, and Malvina Nissim. 2024. A Gentle Push Funziona Benissimo: Making Instructed Models in Italian via Contrastive Activation Steering. In Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024), pages 909–920, Pisa, Italy. CEUR Workshop Proceedings.
Cite (Informal):
A Gentle Push Funziona Benissimo: Making Instructed Models in Italian via Contrastive Activation Steering (Scalena et al., CLiC-it 2024)
PDF:
https://aclanthology.org/2024.clicit-1.98.pdf