2024
Measuring Bias in Instruction-Following Models with ItaP-AT for the Italian Language
Dario Onorati | Davide Venditti | Elena Sofia Ruzzetti | Federico Ranaldi | Leonardo Ranaldi | Fabio Massimo Zanzotto
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
Instruction-Following Language Models (IFLMs) are the state of the art for solving many downstream tasks. Given their widespread use, there is an urgent need to measure whether the sentences they generate contain toxic information or social biases. In this paper, we propose the Prompt Association Test for the Italian language (ItaP-AT): a new resource for testing the presence of social bias across different domains in IFLMs. This work also investigates whether the responses of these models can be made fairer through in-context learning, using “one-shot anti-stereotypical prompts”.
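As a rough illustration of the kind of probe the abstract describes, the sketch below prepends a one-shot anti-stereotypical demonstration to a test question and sends it to an instruction-following model through a Hugging Face text-generation pipeline. The model name, prompt wording, and professions are placeholders for illustration, not the actual ItaP-AT templates.

```python
# Minimal sketch, assuming a Hugging Face text-generation pipeline and an
# illustrative Italian prompt; ItaP-AT defines its own templates and targets.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder model

# One-shot anti-stereotypical demonstration ("Question: who works as an
# engineer? Answer: a woman.") prepended to the test question.
one_shot = "Domanda: chi lavora come ingegnere?\nRisposta: una donna.\n"
test_prompt = one_shot + "Domanda: chi lavora come infermiere?\nRisposta:"

out = generator(test_prompt, max_new_tokens=10, do_sample=False)
print(out[0]["generated_text"])

# Comparing completions with and without the demonstration indicates whether
# the one-shot context shifts the model toward less stereotyped answers.
```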
Assessing the Asymmetric Behaviour of Italian Large Language Models across Different Syntactic Structures
Elena Sofia Ruzzetti | Federico Ranaldi | Dario Onorati | Davide Venditti | Leonardo Ranaldi | Tommaso Caselli | Fabio Massimo Zanzotto
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
As LLMs become more proficient at solving tasks and generating sentences, we investigate the role that different syntactic structures play in models’ performance on a battery of Natural Language Understanding tasks. We analyze the performance of five LLMs on semantically equivalent sentences that are characterized by different syntactic structures. To correctly solve the tasks, a model is implicitly required to correctly parse the sentence. We found that LLMs struggle with more complex syntactic structures, with an average drop of 16.13 (±11.14) points in accuracy on the Q&A task. Additionally, we propose a method based on token attribution to identify which areas of the LLMs encode syntactic knowledge, by locating the model heads and layers responsible for the generation of a correct answer.
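A hedged sketch of the general idea, not the paper’s exact attribution method: with attention outputs enabled, a causal LM exposes per-layer, per-head attention weights, and inspecting how strongly the final position attends to a syntactically distant head noun gives a coarse signal of which heads and layers track the structure. The model, prompt, and token index below are assumptions for illustration.

```python
# Minimal sketch, assuming GPT-2 as a stand-in model; the paper's own
# token-attribution procedure on Italian LLMs is not reproduced here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Toy sentence with a relative clause separating subject and predicate.
prompt = "The keys that the boy lost are on the"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer. Attention
# from the last position back to the head noun "keys" is a coarse proxy for
# which heads/layers track the long-distance subject.
target_idx = 1  # position of " keys" in this toy prompt (assumption)
for layer, att in enumerate(out.attentions):
    per_head = att[0, :, -1, target_idx]
    top = torch.topk(per_head, k=3)
    print(f"layer {layer:2d}  heads {top.indices.tolist()}  "
          f"weights {[round(v, 3) for v in top.values.tolist()]}")
```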
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
Leonardo Ranaldi | Elena Sofia Ruzzetti | Davide Venditti | Dario Onorati | Fabio Massimo Zanzotto
Proceedings of the 13th Joint Conference on Lexical and Computational Semantics (*SEM 2024)
Cheap-to-Build Very Large-Language Models (CtB-LLMs) with affordable training are emerging as the next big revolution in natural language processing and understanding. These CtB-LLMs are democratizing access to trainable Very Large-Language Models (VLLMs) and, thus, may represent the building blocks of many NLP systems solving downstream tasks. Hence, even a small bias in CtB-LLMs may cause significant harm. In this paper, we performed a large-scale investigation of the bias of three families of CtB-LLMs, and we showed that debiasing techniques are effective and usable. Indeed, according to current tests, the LLaMA and the OPT families show substantial bias in gender, race, religion, and profession. In contrast to analyses of other LLMs, we discovered that bias depends not on the number of parameters but on perplexity. Finally, debiasing OPT using LoRA reduces bias by up to 4.12 points in the normalized stereotype score.
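A minimal sketch of attaching LoRA adapters for a debiasing fine-tune, assuming the peft library and an OPT checkpoint as a stand-in; the paper’s actual training data and hyperparameters are not reproduced here.

```python
# Minimal sketch, assuming the peft library and facebook/opt-350m as a
# placeholder from the OPT family; only the small LoRA adapters are trained.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "facebook/opt-350m"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# From here, a standard causal-LM fine-tuning loop (e.g. transformers.Trainer)
# on counter-stereotypical text would update just the adapter weights, after
# which the bias score can be re-measured on the probing resource.
```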
2023
Measuring bias in Instruction-Following models with P-AT
Dario Onorati | Elena Sofia Ruzzetti | Davide Venditti | Leonardo Ranaldi | Fabio Massimo Zanzotto
Findings of the Association for Computational Linguistics: EMNLP 2023
Instruction-Following Language Models (IFLMs) are promising and versatile tools for solving many downstream, information-seeking tasks. Given their success, there is an urgent need to have a shared resource to determine whether existing and new IFLMs are prone to produce biased language interactions. In this paper, we propose Prompt Association Test (P-AT): a new resource for testing the presence of social biases in IFLMs. P-AT stems from WEAT (Caliskan et al., 2017) and generalizes the notion of measuring social biases to IFLMs. Basically, we cast WEAT word tests in promptized classification tasks, and we associate a metric - the bias score. Our resource consists of 2310 prompts. We then experimented with several families of IFLMs discovering gender and race biases in all the analyzed models. We expect P-AT to be an important tool for quantifying bias across different dimensions and, therefore, for encouraging the creation of fairer IFLMs before their distortions have consequences in the real world.