Claudia Wagner

2026

From Emotion to Expression: Theoretical Foundations and Resources for Fear Speech
Vigneshwaran Shankaran | Gabriella Lapesa | Claudia Wagner
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Few forces rival fear in their ability to mobilize societies, distort communication, and reshape collective behavior. In computational linguistics, fear is primarily studied as an emotion, but not as a distinct form of speech. Fear speech content is widespread and growing, and often outperforms hate-speech content in reach and engagement because it appears "civiler" and evades moderation. Yet the computational study of fear speech remains fragmented and under-resourced. This can be understood by recognizing that fear speech is a phenomenon shaped by contributions from multiple disciplines. In this paper, we bridge cross-disciplinary perspectives by comparing theories of fear from Psychology, Political science, Communication science, and Linguistics. Building on this, we review existing definitions. We follow up with a survey of datasets from related research areas and propose a taxonomy that consolidates different dimensions of fear for studying fear speech. By reviewing current datasets and defining core concepts, our work offers both theoretical and practical guidance for creating datasets and advancing fear speech research.

2025

pdf bib abs

Enhancing Training Data Quality through Influence Scores for Generalizable Classification: A Case Study on Sexism Detection
Rabiraj Bandyopadhyay | Dennis Assenmacher | Jose M. Alonso-Moral | Claudia Wagner
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

The quality of training data is crucial for the performance of supervised machine learning models. In particular, poor annotation quality and spurious correlations between labels and features in text dataset can significantly degrade model generalization. This problem is especially pronounced in harmful language detection, where prior studies have revealed major deficiencies in existing datasets. In this work, we design and test data selection methods based on learnability measures to improve dataset quality. Using a sexism dataset with counterfactuals designed to avoid spurious correlations, we show that pruning with EL2N and PVI scores can lead to significant performance increases and outperforms submodular and random selection. Our analysis reveals that in presence of label imbalance models rely on dataset shortcuts; especially easy-to-classify sexist instances and hard-to-classify non-sexist instances contain shortcuts. Pruning these instances leads to performances increases. Pruning hard-to-classify instances is in general a promising strategy as well when shortcuts are not present.

2023

pdf bib abs

People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection
Indira Sen | Dennis Assenmacher | Mattia Samory | Isabelle Augenstein | Wil Aalst | Claudia Wagner
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

NLP models are used in a variety of critical social computing tasks, such as detecting sexist, racist, or otherwise hateful content. Therefore, it is imperative that these models are robust to spurious features. Past work has attempted to tackle such spurious features using training data augmentation, including Counterfactually Augmented Data (CADs). CADs introduce minimal changes to existing training data points and flip their labels; training on them may reduce model dependency on spurious features. However, manually generating CADs can be time-consuming and expensive. Hence in this work, we assess if this task can be automated using generative NLP models. We automatically generate CADs using Polyjuice, ChatGPT, and Flan-T5, and evaluate their usefulness in improving model robustness compared to manually-generated CADs. By testing both model performance on multiple out-of-domain test sets and individual data point efficacy, our results show that while manual CADs are still the most effective, CADs generated by ChatGPT come a close second. One key reason for the lower performance of automated methods is that the changes they introduce are often insufficient to flip the original label.

2022

pdf bib abs

Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection
Indira Sen | Mattia Samory | Claudia Wagner | Isabelle Augenstein
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement is credited to promoting core features of the construct over spurious artifacts that happen to correlate with it. Yet, over-relying on core features may lead to unintended model bias. Especially, construct-driven CAD—perturbations of core features—may induce models to ignore the context in which core features are used. Here, we test models for sexism and hate speech detection on challenging data: non-hate and non-sexist usage of identity and gendered terms. On these hard cases, models trained on CAD, especially construct-driven CAD, show higher false positive rates than models trained on the original, unperturbed data. Using a diverse set of CAD—construct-driven and construct-agnostic—reduces such unintended bias.

2021

pdf bib abs

How Does Counterfactually Augmented Data Impact Models for Social Computing Constructs?
Indira Sen | Mattia Samory | Fabian Flöck | Claudia Wagner | Isabelle Augenstein
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, it is crucial to ensure that these models are robust. One way of improving model robustness is to generate counterfactually augmented data (CAD) for training models that can better learn to distinguish between core features and data artifacts. While models trained on this type of data have shown promising out-of-domain generalizability, it is still unclear what the sources of such improvements are. We investigate the benefits of CAD for social NLP models by focusing on three social computing constructs — sentiment, sexism, and hate speech. Assessing the performance of models trained with and without CAD across different types of datasets, we find that while models trained on CAD show lower in-domain performance, they generalize better out-of-domain. We unpack this apparent discrepancy using machine explanations and find that CAD reduces model reliance on spurious features. Leveraging a novel typology of CAD to analyze their relationship with model performance, we find that CAD which acts on the construct directly or a diverse set of CAD leads to higher performance.

2020

pdf bib abs

On the Reliability and Validity of Detecting Approval of Political Actors in Tweets
Indira Sen | Fabian Flöck | Claudia Wagner
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Social media sites like Twitter possess the potential to complement surveys that measure political opinions and, more specifically, political actors’ approval. However, new challenges related to the reliability and validity of social-media-based estimates arise. Various sentiment analysis and stance detection methods have been developed and used in previous research to measure users’ political opinions based on their content on social media. In this work, we attempt to gauge the efficacy of untargeted sentiment, targeted sentiment, and stance detection methods in labeling various political actors’ approval by benchmarking them across several datasets. We also contrast the performance of these pretrained methods that can be used in an off-the-shelf (OTS) manner against a set of models trained on minimal custom data. We find that OTS methods have low generalizability on unseen and familiar targets, while low-resource custom models are more robust. Our work sheds light on the strengths and limitations of existing methods proposed for understanding politicians’ approval from tweets.

Claudia Wagner

2026

2025

2023

2022

2021

2020

2018

Co-authors

Venues