Milena Belosevic

2026

Invisible Speakers? Gender Disparity in German AI Discourse and Its Reflection in Language Models
Milena Belosevic
Proceedings of the 10th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature 2026

This paper investigates how language models (LMs) reproduce the existing gender disparity found in German media discourse about artificial intelligence (AI). Building on a human-annotated corpus of quotations from German media discourse on AI, we first quantify the frequency with which male and female speakers are directly cited across domains and speaker roles. We then train three classifiers, namely LLäMmlein (Pfister et al., 2025), a state-of-the-art German-only language model, GBERT, and a logistic regression model, to classify each quotation as originating from a male or female speaker, using only the quoted text as input and providing no gender cues. By comparing model predictions with corpus-based gold labels, we find that male voices dominate both the corpus and the model predictions. Balancing the data mitigates but does not fully eliminate this disparity, indicating that the strong male-default tendency of transformer models cannot be explained by corpus skew alone but also reflects priors acquired during pretraining. The study contributes to the interpretability of language models’ output for DH-related tasks, adaptation of NLP tools to domain-specific humanities corpora, and knowledge modelling in the humanities.

2025

LLM-based Classification of Grounding Acts in German
Milena Belosevic | Hendrik Buschmeier
Proceedings of the 21st Conference on Natural Language Processing (KONVENS 2025): Long and Short Papers

User-Centric Design Paradigms for Trust and Control in Human-LLM-Interactions: A Survey
Milena Belosevic
Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)

As LLMs become widespread, trust in their behavior becomes increasingly important. For NLP research, it is crucial to ensure that not only AI designers and developers but also end users are able to control the properties of trustworthy LLMs, such as transparency, privacy, or accuracy. However, involving end users in this process remains a practical challenge. Based on a design-centered survey of methods developed in recent papers from HCI and NLP venues, this paper proposes seven design paradigms that can be integrated into NLP research to enhance end-user control over the trustworthiness of LLMs. We discuss design gaps and challenges of applying these paradigms in NLP and propose future research directions.

Tore-Klose: Record Scorer, Goal Hunter, Machine? Human Association Norms for German Personal Name Compounds
Annerose Eichel | Tana Deeg | Andre Blessing | Milena Belosevic | Sabine Arndt-Lappe | Sabine Schulte im Walde
Proceedings of the 2nd Workshop on Analogical Abstraction in Cognition, Perception, and Language (Analogy-Angle II)

We present a collection of human association norms for German personal name compounds (PNCs) such as “Tore-Klose” (goal-Klose) and the corresponding full names (Miroslav Klose), thus providing a novel testbed for PNC evaluation, i.e., analogical vs. contrastive positive vs. negative perception effects. The associations are obtained in an online experiment with German native speakers, analyzed regarding our novel intertwined PNC–person association setup, and accompanied by an LLM synthetic generation approach for augmentation.

2024

Willkommens-Merkel, Chaos-Johnson, and Tore-Klose: Modeling the Evaluative Meaning of German Personal Name Compounds
Annerose Eichel | Tana Deeg | Andre Blessing | Milena Belosevic | Sabine Arndt-Lappe | Sabine Schulte im Walde
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We present a comprehensive computational study of the under-investigated phenomenon of personal name compounds (PNCs) in German such as Willkommens-Merkel (‘Welcome-Merkel’). Prevalent in news, social media, and political discourse, PNCs are hypothesized to exhibit an evaluative function that is reflected in a more positive or negative perception as compared to the respective personal full name (such as Angela Merkel). We model 321 PNCs and their corresponding full names at discourse level, and show that PNCs bear an evaluative nature that can be captured through a variety of computational methods. Specifically, we assess through valence information whether a PNC is more positively or negatively evaluative than the person’s name, by applying and comparing two approaches using (i) valence norms and (ii) pre-trained language models (PLMs). We further enrich our data with personal, domain-specific, and extra-linguistic information and perform a range of regression analyses revealing that factors including compound and modifier valence, domain, and political party membership influence how a PNC is evaluated.
