Mirko Lai

2025

Are you sure? Measuring models bias in content moderation through uncertainty
Alessandra Urbinati | Mirko Lai | Simona Frenda | Marco Antonio Stranisci
Findings of the Association for Computational Linguistics: EMNLP 2025

Automatic content moderation is crucial to ensuring safety in social media. Language Model-based classifiers are increasingly adopted for this task, but it has been shown that they perpetuate racial and social biases. Even if several resources and benchmark corpora have been developed to challenge this issue, measuring the fairness of models in content moderation remains an open issue. In this work, we present an unsupervised approach that benchmarks models on the basis of their uncertainty in classifying messages annotated by people belonging to vulnerable groups. We use uncertainty, computed by means of the conformal prediction technique, as a proxy to analyze the bias of 11 models (LMs and LLMs) against women and non-white annotators and observe to what extent it diverges from metrics based on performance, such as the F1 score. The results show that some pre-trained models predict with high accuracy the labels coming from minority groups, even if the confidence in their prediction is low. Therefore, by measuring the confidence of models, we are able to see which groups of annotators are better represented in pre-trained models and lead the debiasing process of these models before their effective use.

2024

pdf bib abs

In recent years, the Gender Based Violence (GBV) has become an important issue in modern society and a central topic in different research areas due to its alarming spread. Several Natural Language Processing (NLP) studies, concerning Hate Speech directed against women, have focused on slurs or incel communities. The main contribution of our work is the creation of the first dataset on social media comments to GBV, in particular to a femicide event. Our dataset, named GBV-Maltesi, contains 2,934 YouTube comments annotated following a new schema that we developed in order to study GBV and misogyny with an intersectional approach. During the experimental phase, we trained models on different corpora for binary misogyny detection and found that datasets that mostly include explicit expressions of misogyny are an easier challenge, compared to more implicit forms of misogyny contained in GVB-Maltesi.

2023

pdf bib

DisaggregHate It Corpus: A Disaggregated Italian Dataset of Hate Speech
Marco Madeddu | Simona Frenda | Mirko Lai | Viviana Patti | Valerio Basile
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

pdf bib

Inters8: A Corpus to Study Misogyny and Intersectionality on Twitter
Ivan Spada | Mirko Lai | Viviana Patti
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

pdf bib

Debunker Assistant: A Support for Detecting Online Misinformation
Arthur Thomas Edward Capozzi Lupi | Alessandra Teresa Cignarella | Simona Frenda | Mirko Lai | Marco Antonio Stranisci | Alessandra Urbinati
Proceedings of the Ninth Italian Conference on Computational Linguistics (CLiC-it 2023)

2022

pdf bib abs

Inside the NLP community there is a considerable amount of language resources created, annotated and released every day with the aim of studying specific linguistic phenomena. Despite a variety of attempts in order to organize such resources has been carried on, a lack of systematic methods and of possible interoperability between resources are still present. Furthermore, when storing linguistic information, still nowadays, the most common practice is the concept of “gold standard”, which is in contrast with recent trends in NLP that aim at stressing the importance of different subjectivities and points of view when training machine learning and deep learning methods. In this paper we present O-Dang!: The Ontology of Dangerous Speech Messages, a systematic and interoperable Knowledge Graph (KG) for the collection of linguistic annotated data. O-Dang! is designed to gather and organize Italian datasets into a structured KG, according to the principles shared within the Linguistic Linked Open Data community. The ontology has also been designed to account a perspectivist approach, since it provides a model for encoding both gold standard and single-annotator labels in the KG. The paper is structured as follows. In Section 1 the motivations of our work are outlined. Section 2 describes the O-Dang! Ontology, that provides a common semantic model for the integration of datasets in the KG. The Ontology Population stage with information about corpora, users, and annotations is presented in Section 3. Finally, in Section 4 an analysis of offensiveness across corpora is provided as a first case study for the resource.

The paper introduces a new annotated French data set for Sentiment Analysis, which is a currently missing resource. It focuses on the collection from Twitter of data related to the socio-political debate about the reform of the bill for wedding in France. The design of the annotation scheme is described, which extends a polarity label set by making available tags for marking target semantic areas and figurative language devices. The annotation process is presented and the disagreement discussed, in particular, in the perspective of figurative language use and in that of the semantic oriented annotation, which are open challenges for NLP systems.

Mirko Lai

2025

2024

2023

2022

2021

2019

2018

2016

Co-authors

Venues