Saba Ghanbari Haez

2025

Neutral Is Not Unbiased: Evaluating Implicit and Intersectional Identity Bias in LLMs Through Structured Narrative Scenarios
Saba Ghanbari Haez | Mauro Dragoni
Findings of the Association for Computational Linguistics: EMNLP 2025

Large Language Models often reproduce societal biases, yet most evaluations overlook how such biases evolve across nuanced contexts or intersecting identities. We introduce a scenario-based evaluation framework built on 100 narrative tasks, designed to be neutral at baseline and systematically modified with gender and age cues. Grounded in the theory of Normative-Narrative Scenarios, our approach provides ethically coherent and socially plausible settings for probing model behavior. Analyzing responses from five leading LLMs—GPT-4o, LLaMA 3.1, Qwen2.5, Phi-4, and Mistral—using Critical Discourse Analysis and quantitative linguistic metrics, we find consistent evidence of bias. Gender emerges as the dominant axis of bias, with intersectional cues (e.g., age and gender combined) further intensifying disparities. Our results underscore the value of dynamic narrative progression for detecting implicit, systemic biases in Large Language Models.

2024

Building Certified Medical Chatbots: Overcoming Unstructured Data Limitations with Modular RAG
Leonardo Sanna | Patrizio Bellan | Simone Magnolini | Marina Segala | Saba Ghanbari Haez | Monica Consolandi | Mauro Dragoni
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024

Creating a certified conversational agent poses several issues. The need to manage fine-grained information delivery and the necessity to provide reliable medical information require a notable effort, especially in dataset preparation. In this paper, we investigate the challenges of building a certified medical chatbot in Italian that provides information about pregnancy and early childhood. We report negative initial results on the possibility of creating a certified conversational agent within the RASA framework starting from unstructured data. Finally, we propose a modular RAG model to implement a Large Language Model in a certified context, overcoming data limitations and enabling data collection on actual conversations.
