Douglas Summers-Stay

2024

Generating Converging Narratives for Games with Large Language Models
Douglas Summers-Stay | Clare R. Voss
Proceedings of the 10th Workshop on Games and Natural Language Processing @ LREC-COLING 2024

We explore methods of combining the probability distributions generated by two LLM prompts in order to generate a continuation that is appropriate for both prompts at once. This is a new capability that extends the possibilities for branching and rejoining narratives in games.
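
The abstract does not say which combination rule is used, so the following is only a minimal sketch of one plausible option: a weighted geometric mean of two next-token distributions (a product-of-experts style merge), renormalized to sum to one. The function names, the `weight` parameter, the three-word vocabulary, and the probabilities are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert log-scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_distributions(p, q, weight=0.5):
    """Merge two next-token distributions defined over the same vocabulary.

    Assumption: a weighted geometric mean, renormalized. This is one
    illustrative rule, not necessarily the method used in the paper.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same vocabulary")
    eps = 1e-12  # guards against log(0)
    log_mix = [
        weight * math.log(a + eps) + (1.0 - weight) * math.log(b + eps)
        for a, b in zip(p, q)
    ]
    return softmax(log_mix)


# Toy next-token probabilities from two story branches (made-up numbers).
vocab = ["castle", "forest", "tavern"]
p_branch_a = [0.70, 0.25, 0.05]
p_branch_b = [0.05, 0.30, 0.65]

combined = combine_distributions(p_branch_a, p_branch_b)
best = vocab[max(range(len(vocab)), key=combined.__getitem__)]
print(best)
```

Under this rule, a token that is moderately likely under both prompts ("forest" here) outranks tokens that are highly likely under only one of them, which is the behavior a continuation shared by two branches would need.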

2022

We evaluate an annotation schema for labeling logical fallacy types, originally developed for a crowd-sourcing annotation paradigm, now using an annotation paradigm of two trained linguist annotators. We apply the schema to a variety of genres of text relating to the COVID-19 pandemic. Our linguist (as opposed to crowd-sourced) annotation of logical fallacies allows us to evaluate whether the annotation schema category labels are sufficiently clear and non-overlapping for both manual and, later, system assignment. We report inter-annotator agreement results over two annotation phases as well as a preliminary assessment of the corpus for training and testing a machine learning algorithm (Pattern-Exploiting Training) for fallacy detection and recognition. The agreement results and system performance underscore the challenging nature of this annotation task and suggest that the annotation schema and paradigm must be iteratively evaluated and refined to arrive at a set of annotation labels that human annotators can reproduce and that, in turn, provide reliable training data for automatic detection and recognition systems.

2021

What Can a Generative Language Model Answer About a Passage?
Douglas Summers-Stay | Claire Bonial | Clare Voss
Proceedings of the 3rd Workshop on Machine Reading for Question Answering

Generative language models trained on large, diverse corpora can answer questions about a passage by generating the most likely continuation of the passage followed by a question/answer pair. However, accuracy rates vary depending on the type of question asked. In this paper, we keep the passage fixed and test with a wide variety of question types, exploring the strengths and weaknesses of the GPT-3 language model. We provide the passage and test questions as a challenge set for other language models.