Carla Kam


2024

How Useful is Context, Actually? Comparing LLMs and Humans on Discourse Marker Prediction
Emily Sadlier-Brown | Millie Lou | Miikka Silfverberg | Carla Kam
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

This paper investigates the adverbial discourse particle "actually". We compare LLM and human performance on cloze tests involving "actually", using examples sourced from the Providence Corpus of speech around children. We explore the impact of utterance context on cloze test performance. We find that context is always helpful, though the extent to which additional context helps, and which relative placement of context (i.e., before or after the masked word) is most helpful, differ across individual models and humans. The best-performing LLM, GPT-4, narrowly outperforms humans. In an additional experiment, we explore cloze performance on synthetic LLM-generated examples and find that several models vastly outperform humans.