Asteria Kaeberlein
2025
Reversing Causal Assumptions: Explainability in Online Sports Dialogues
Asteria Kaeberlein | Malihe Alikhani
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Prior XAI research often assumes inputs must be “causes” and outputs must be “effects”, severely limiting its applicability to behaviors that emerge as reactions or consequences. Many linguistic tasks, such as dialogues and conversations, involve such behaviors. To address this, we propose that the assumed causality from inputs to outputs can be reversed and still remain valid by using outputs that cause changes in the input features. We show how this enables analysis of complex feature sets through simpler metrics, propose a framework that generalizes to most linguistic tasks, and highlight best practices for applying it. By training a predictive model from complex effects to simple causes, we apply feature attributions to estimate how the inputs change with the outputs. We demonstrate an application of this by studying sports fans’ comments made during a game and comparing those comments to a simpler metric, win probability. We also expand on a prior study of intergroup bias, demonstrating how our framework can uncover behaviors that other XAI methods may overlook. We discuss the implications of these findings for advancing interpretability in computational linguistics and improving data-driven decision-making in social contexts.
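A minimal illustrative sketch of the reversed-attribution idea described above: a model is trained from the complex effects (per-comment features) to a simple causal metric (win probability), and feature attributions on that reversed model estimate how the comment features move with the game state. The feature names, synthetic data, and the use of permutation importance are assumptions for illustration only; the paper's actual model and attribution method are not specified in this abstract.

```python
# Illustrative sketch only; all data and feature names below are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-comment features (the "complex effects"): sentiment score,
# comment length, exclamation count, and team-mention ratio.
feature_names = ["sentiment", "length", "exclamations", "team_mentions"]
X = rng.normal(size=(2000, 4))

# The simple causal metric: win probability at the time each comment was posted.
# Here it is simulated so the example runs end to end.
y = 1 / (1 + np.exp(-(1.5 * X[:, 0] - 0.5 * X[:, 2]
                      + rng.normal(scale=0.3, size=2000))))

# Reverse the usual direction: predict the simple cause (win probability)
# from the complex effects (comment features).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# Feature attributions on the reversed model indicate which comment features
# change most strongly with the underlying win probability.
result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda t: -t[1]):
    print(f"{name}: {score:.3f}")
```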
How LLMs Influence Perceived Bias in Journalism
Asteria Kaeberlein | Malihe Alikhani
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
As the use of generative AI tools in journalistic writing becomes more common, reporters have expressed growing concerns about how it may introduce bias into their work. This paper investigates how the integration of large language models (LLMs) into journalistic writing, both as editors and as independent ‘authors’, can alter user perception of bias in media. We present novel insights into how human perception of media bias differs from automatic evaluations. Through human evaluations comparing original human-authored articles, AI-edited articles, and AI-generated articles, we show that while LLMs rarely introduce new bias and often trend towards neutrality, this supposedly ‘safe’ behavior can have harmful impacts. This is most observable in sensitive human rights contexts, where the AI’s neutral and measured tone can reduce the representation of relevant voices and present misinformation in a more convincing manner. Furthermore, we demonstrate previously unidentified patterns that existing automated bias detection methods fail to capture accurately. We underscore the critical need for human-centered evaluation frameworks in AI-assisted journalism by introducing human evaluations and contrasting them with a state-of-the-art automated bias detection system.