2024
Multi-property Steering of Large Language Models with Dynamic Activation Composition
Daniel Scalena | Gabriele Sarti | Malvina Nissim
Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Activation steering methods have been shown to effectively condition language model generation by additively intervening on models’ intermediate representations. However, these techniques have so far been evaluated only on single conditioning properties and in synthetic settings. In this work, we conduct a comprehensive evaluation of various activation steering strategies, highlighting the property-dependent nature of optimal parameters for ensuring a robust effect throughout generation. To address this issue, we propose Dynamic Activation Composition, an information-theoretic approach that modulates the steering intensity of one or more properties throughout generation. Our experiments on multi-property steering show that our method maintains strong conditioning while minimizing its impact on generation fluency.
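The abstract names the approach only as information-theoretic, so the following is a minimal, hypothetical sketch of one plausible instantiation: the per-step steering intensity is derived from the KL divergence between the next-token distributions with and without steering, so that steering backs off once both predictions converge. The function names and the mapping from divergence to intensity are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' released code: set the per-step
# steering strength from the KL divergence between the steered and
# unsteered next-token distributions.
import torch
import torch.nn.functional as F

def dynamic_alpha(logits_plain: torch.Tensor,
                  logits_steered: torch.Tensor,
                  alpha_max: float = 2.0) -> float:
    """Steering intensity for the current decoding step.

    `logits_steered` is assumed to come from a probe forward pass with
    full-strength steering; `logits_plain` from an unsteered pass.
    """
    p = F.log_softmax(logits_steered, dim=-1)
    q = F.log_softmax(logits_plain, dim=-1)
    # KL(steered || plain): how much the conditioning would change the output.
    kl = F.kl_div(q, p, log_target=True, reduction="sum").item()
    # Bounded mapping (an illustrative choice): steer harder while the two
    # predictions disagree, back off once they converge to protect fluency.
    return alpha_max * min(1.0, kl)

if __name__ == "__main__":
    vocab = 50
    plain = torch.randn(vocab)
    steered = plain + 0.5 * torch.randn(vocab)
    print(f"alpha at this step: {dynamic_alpha(plain, steered):.3f}")
```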
A Gentle Push Funziona Benissimo: Making Instructed Models in Italian via Contrastive Activation Steering
Daniel Scalena | Elisabetta Fersini | Malvina Nissim
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Adapting models to a language that was only partially present in the pre-training data requires fine-tuning, which is expensive in terms of both data and computational resources. As an alternative to fine-tuning, we explore the potential of activation steering-based techniques to enhance model performance on Italian tasks. Our experiments show that Italian steering (i) can be successfully applied to different models, (ii) achieves performance comparable to, or even better than, that of models fine-tuned for Italian, and (iii) yields higher quality and consistency in Italian generations. We also discuss the utility of steering and fine-tuning in the current LLM landscape, where models achieve strong Italian performance even without being explicitly trained in this language.
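As a rough illustration of the contrastive activation steering the title refers to, the sketch below builds a steering vector as the difference between mean activations on target-language and source-language inputs and adds it to a layer's output via a forward hook. The toy layer and random inputs are stand-ins: with a real LM one would hook a transformer block and encode actual Italian and English prompts.

```python
# Minimal sketch of contrastive activation steering under toy assumptions;
# every tensor and module here is a hypothetical stand-in for the real LM.
import torch

torch.manual_seed(0)
hidden = 16

# Toy "layer": in practice, one transformer block of the language model.
layer = torch.nn.Linear(hidden, hidden)

# Contrastive vector: mean activation on target-language (Italian) inputs
# minus mean activation on source-language (English) inputs.
italian_inputs = torch.randn(8, hidden)  # stand-ins for encoded Italian prompts
english_inputs = torch.randn(8, hidden)  # stand-ins for encoded English prompts
steer = layer(italian_inputs).mean(0) - layer(english_inputs).mean(0)

alpha = 1.5  # steering intensity; the papers note optimal values are property-dependent

# Forward hook that adds the steering vector to the layer's output
# on every subsequent forward pass (i.e., at each generation step).
def steering_hook(module, inputs, output):
    return output + alpha * steer

handle = layer.register_forward_hook(steering_hook)
steered_out = layer(torch.randn(1, hidden))  # hook now applies
handle.remove()
```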
CALAMITA: Challenge the Abilities of LAnguage Models in ITAlian
Giuseppe Attanasio | Pierpaolo Basile | Federico Borazio | Danilo Croce | Maria Francis | Jacopo Gili | Elio Musacchio | Malvina Nissim | Viviana Patti | Matteo Rinaldi | Daniel Scalena
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
The rapid development of Large Language Models (LLMs) has called for robust benchmarks to assess their abilities, track progress, and compare iterations. While existing benchmarks provide extensive evaluations across diverse tasks, they predominantly focus on English, leaving other languages underserved. For Italian, the EVALITA campaigns have established a long-standing tradition of classification-focused shared tasks, but their scope does not fully align with the nuanced evaluation required for modern LLMs. To address this gap, we introduce “Challenge the Abilities of LAnguage Models in ITAlian” (CALAMITA), a collaborative effort to create a dynamic and growing benchmark tailored to Italian. CALAMITA emphasizes diversity in task design to test a wide range of LLM capabilities through resources natively developed in Italian by the community. The initiative includes a shared platform, a live leaderboard, and a centralized evaluation framework. This paper outlines CALAMITA’s collaborative process, initial challenges, and evaluation framework.
2023
MIND at SemEval-2023 Task 11: From Uncertain Predictions to Subjective Disagreement
Giulia Rizzi | Alessandro Astorino | Daniel Scalena | Paolo Rosso | Elisabetta Fersini
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes the participation of the MIND research laboratory at the University of Milano-Bicocca in the SemEval 2023 Learning With Disagreements (Le-Wi-Di) task. The main goal is to identify the level of agreement or disagreement among annotators across a collection of textual datasets that differ in style, language, and task. The proposed approach is grounded in the hypothesis that annotator disagreement can be captured by the uncertainty that a model, based on several linguistic characteristics, exhibits when predicting a given gold label.
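A minimal sketch of that hypothesis, under the assumption that uncertainty is measured as the entropy of a classifier's softmax over labels; the classifier and the example logits below are hypothetical stand-ins.

```python
# Illustrative sketch: predictive entropy as a proxy for annotator
# disagreement. Logits would come from a trained label classifier.
import torch
import torch.nn.functional as F

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the label distribution, one value per example."""
    log_p = F.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1)

# Two toy examples: a confident prediction and an uncertain one.
logits = torch.tensor([[4.0, -2.0],    # model is sure  -> low disagreement
                       [0.1, -0.1]])   # model is torn  -> high disagreement
print(predictive_entropy(logits))      # higher entropy ~ more disagreement
```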