Martin Wessel

2025

LLM-based Adversarial Dataset Augmentation for Automatic Media Bias Detection
Martin Wessel
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)

This study presents BiasAdapt, a novel data augmentation strategy designed to enhance the robustness of automatic media bias detection models. Leveraging the BABE dataset, BiasAdapt uses a generative language model to identify bias-indicative keywords and replace them with alternatives from opposing categories, thus creating adversarial examples that preserve the original bias labels. The contributions of this work are twofold: it proposes a scalable method for augmenting bias datasets with adversarial examples while preserving labels, and it publicly releases an augmented adversarial media bias dataset. Training on BiasAdapt reduces the reliance on spurious cues in four of the six evaluated media bias categories.

2024

Beyond the Surface: Spurious Cues in Automatic Media Bias Detection
Martin Wessel | Tomáš Horych
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion

This study investigates the robustness and generalization of transformer-based models for automatic media bias detection. We explore the behavior of current bias classifiers by analyzing feature attributions and stress-testing with adversarial datasets. The findings reveal a disproportionate focus on rare but strongly connotated words, suggesting a rather superficial understanding of linguistic bias and challenges in contextual interpretation. This problem is further highlighted by inconsistent bias assessment when stress-tested with different entities and minorities. Enhancing automatic media bias detection models is critical to improving inclusivity in media, ensuring balanced and fair representation of diverse perspectives.

Co-authors

Tomáš Horych 1
