Michele Joshua Maggini
2026
PartisanLens: A Multilingual Dataset of Hyperpartisan and Conspiratorial Immigration Narratives in European Media
Michele Joshua Maggini | Paloma Piot | Anxo Pérez | Erik Bran Marino | Lúa Santamaría Montesinos | Ana Lisboa Cotovio | Marta Vázquez Abuín | Javier Parapar | Pablo Gamallo
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Detecting hyperpartisan narratives and Population Replacement Conspiracy Theories (PRCT) is essential to addressing the spread of misinformation. These complex narratives pose a significant threat: hyperpartisanship drives political polarisation and institutional distrust, while PRCTs directly motivate real-world extremist violence, making their identification critical for social cohesion and public safety. However, existing resources are scarce, predominantly English-centric, and often analyse hyperpartisanship, stance, and rhetorical bias in isolation rather than as interrelated aspects of political discourse. To bridge this gap, we introduce PartisanLens, the first multilingual dataset of 1617 hyperpartisan news headlines in Spanish, Italian, and Portuguese, annotated for multiple aspects of political discourse. We first evaluate the classification performance of widely used Large Language Models (LLMs) on this dataset, establishing robust baselines for the classification of hyperpartisan and PRCT narratives. In addition, we assess the viability of using LLMs as automatic annotators for this task, analysing their ability to approximate human annotation. Results highlight both their potential and current limitations. Next, moving beyond standard judgments, we explore whether LLMs can emulate human annotation patterns by conditioning them on socio-economic and ideological profiles that simulate annotator perspectives. Finally, we release our resources and evaluation. PartisanLens supports future research on detecting partisan and conspiratorial narratives in European contexts.
2025
Detecting Hyperpartisanship and Rhetorical Bias in Climate Journalism: A Sentence-Level Italian Dataset
Michele Joshua Maggini | Davide Bassi | Pablo Gamallo
Proceedings of the 2nd Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2025)
We present the first Italian dataset for joint hyperpartisan and rhetorical bias detection in climate change discourse. The dataset comprises 48 articles (1,010 sentences) from far-right media outlets, annotated at sentence level for both binary hyperpartisan classification and a fine-grained taxonomy of 17 rhetorical biases. Our annotation scheme achieves a Cohen’s kappa agreement of 0.63 on the gold test set (173 sentences), reflecting both the complexity of the task and the reliability of the annotation. We conduct extensive analysis revealing significant correlations between hyperpartisan content and specific rhetorical techniques, particularly in climate change, Euroscepticism, and green policy coverage. To the best of our knowledge, we are the first to study hyperpartisan detection in relation to logical fallacies, examining the correlation between the two, and no previous work has addressed hyperpartisan detection at the sentence level. Our experiments with state-of-the-art language models (GPT-4o-mini) and Italian BERT-base models establish strong baselines for both tasks, while highlighting the challenges in detecting subtle manipulation strategies that rely on rhetorical biases. To ensure reproducibility while addressing copyright concerns, we release article URLs, article IDs, and paragraph numbers alongside comprehensive annotation guidelines. This resource advances research in cross-lingual propaganda detection and provides insights into the rhetorical strategies employed in Italian climate change discourse. We provide the code and the dataset to reproduce our results: https://anonymous.4open.science/r/Climate_HP-RB-D5EF/README.md
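For readers unfamiliar with the agreement statistic cited in the abstract above, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators assigning binary hyperpartisan labels. It is not the authors' code: the function name `cohens_kappa` and the toy label lists are hypothetical and chosen only for illustration, and the values are not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the two marginal label frequencies,
    # summed over labels. Counter returns 0 for labels one annotator never used.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    # Degenerate case: both annotators used one identical label throughout.
    if p_e == 1.0:
        return 1.0

    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy annotations (1 = hyperpartisan, 0 = not hyperpartisan).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this toy example the annotators agree on 8 of 10 items while chance agreement is 0.5, so kappa comes out at roughly 0.6, which is in the same range as the value reported in the abstract.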