Maria Antoniak


Heroes, Villains, and Victims, and GPT-3: Automated Extraction of Character Roles Without Training Data
Dominik Stammbach | Maria Antoniak | Elliott Ash
Proceedings of the 4th Workshop on Narrative Understanding (WNU2022)

This paper shows how to use large-scale pretrained language models to extract character roles from narrative texts without domain-specific training data. Queried with a zero-shot question-answering prompt, GPT-3 can identify the hero, villain, and victim in diverse domains: newspaper articles, movie plot summaries, and political speeches.
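A minimal sketch of what a zero-shot question-answering prompt for role extraction might look like. The prompt wording, the `ROLES` tuple, and both function names are illustrative assumptions, not the template used in the paper, and no language-model API is called here; the resulting strings would be passed to whichever completion endpoint is in use.

```python
# Hypothetical prompt builder for zero-shot character-role extraction.
# The wording below is an assumption for illustration, not the paper's prompt.

ROLES = ("hero", "villain", "victim")


def build_role_prompt(text: str, role: str) -> str:
    """Return a question-answering prompt asking for one role in `text`."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return (
        f"Who is the {role} in the following text?\n\n"
        f"Text: {text.strip()}\n\n"
        f"{role.capitalize()}:"
    )


def build_all_prompts(text: str) -> dict:
    """Build one prompt per role, so each role is queried independently."""
    return {role: build_role_prompt(text, role) for role in ROLES}


prompts = build_all_prompts("The mayor saved the town from the flood.")
print(prompts["hero"])
```

Because no task-specific examples appear in the prompt, the same template can be applied unchanged to news articles, plot summaries, or speeches.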

Narrative Datasets through the Lenses of NLP and HCI
Sharifa Sultana | Renwen Zhang | Hajin Lim | Maria Antoniak
Proceedings of the Second Workshop on Bridging Human–Computer Interaction and Natural Language Processing

In this short paper, we compare existing value systems and approaches in NLP and HCI for collecting narrative data. Building on these parallel discussions, we shed light on the challenges facing some popular NLP dataset types, which we discuss in relation to widely used narrative-based HCI research methods, and we highlight points where NLP methods can broaden qualitative narrative studies. In particular, we point towards contextuality, positionality, dataset size, and open research design as central points of difference and windows for collaboration when studying narratives. Through the use case of narratives, this work contributes to a larger conversation regarding the possibilities for bridging NLP and HCI through speculative mixed-methods.


Bad Seeds: Evaluating Lexical Methods for Bias Measurement
Maria Antoniak | David Mimno
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

A common factor in bias measurement methods is the use of hand-curated seed lexicons, but there remains little guidance for their selection. We gather seeds used in prior work, documenting their common sources and rationales, and in case studies of three English-language corpora, we enumerate the different types of social biases and linguistic features that, once encoded in the seeds, can affect subsequent bias measurements. Seeds developed in one context are often re-used in other contexts, but documentation and evaluation remain necessary precursors to relying on seeds for sensitive measurements.

‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’
A. Cooper | Maria Antoniak | Christopher De Sa | Marilyn Migiel | David Mimno
Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

We explore Boccaccio’s Decameron to see how digital humanities tools can be used for tasks that have limited data in a language no longer in contemporary use: medieval Italian. We focus our analysis on the question: Do the different storytellers in the text exhibit distinct personalities? To answer this question, we curate and release a dataset based on the authoritative edition of the text. We use supervised classification methods to predict storytellers based on the stories they tell, confirming the difficulty of the task, and demonstrate that topic modeling can extract thematic storyteller “profiles.”


Evaluating the Stability of Embedding-based Word Similarities
Maria Antoniak | David Mimno
Transactions of the Association for Computational Linguistics, Volume 6

Word embeddings are increasingly being used as a tool to study word associations in specific corpora. However, it is unclear whether such embeddings reflect enduring properties of language or if they are sensitive to inconsequential variations in the source documents. We find that nearest-neighbor distances are highly sensitive to small changes in the training corpus for a variety of algorithms. For all methods, including specific documents in the training set can result in substantial variations. We show that these effects are more prominent for smaller training corpora. We recommend that users never rely on single embedding models for distance calculations, but rather average over multiple bootstrap samples, especially for small corpora.
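The recommendation to average over bootstrap samples can be sketched as follows. This is a toy illustration under stated assumptions: `train_embedding` is a stand-in that uses raw term-document counts as word vectors (not one of the embedding algorithms evaluated in the paper), and the function names, the example corpus, and the choice of 50 samples are all hypothetical.

```python
import math
import random
import statistics


def train_embedding(docs, vocab):
    """Toy stand-in for an embedding model: each word's vector is its
    count in every document of the (resampled) corpus."""
    vectors = {w: [0.0] * len(docs) for w in vocab}
    for j, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in vectors:
                vectors[token][j] += 1.0
    return vectors


def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def bootstrap_similarity(docs, w1, w2, n_samples=50, seed=0):
    """Resample documents with replacement, retrain on each sample, and
    report the mean and spread of the similarity between w1 and w2."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_samples):
        sample = rng.choices(docs, k=len(docs))  # bootstrap sample of documents
        vectors = train_embedding(sample, [w1, w2])
        sims.append(cosine(vectors[w1], vectors[w2]))
    return statistics.mean(sims), statistics.pstdev(sims)


docs = [
    "the cat chased the dog",
    "the dog barked at the cat",
    "a cat sat on the mat",
    "the dog slept in the yard",
    "birds sang in the yard",
]
mean_sim, spread = bootstrap_similarity(docs, "cat", "dog")
print(f"mean similarity: {mean_sim:.3f}, std across samples: {spread:.3f}")
```

Reporting the spread alongside the mean makes the sensitivity to document selection visible, rather than hiding it behind a single model's score.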