On the alignment of LM language generation and human language comprehension

Lena Sophia Bolliger, Patrick Haller, Lena Ann Jäger


Abstract
Previous research on the predictive power (PP) of surprisal and entropy has focused on determining which language models (LMs) generate estimates with the highest PP on reading times, and examining for which populations the PP is strongest. In this study, we leverage eye movement data on texts that were generated using a range of decoding strategies with different LMs. We then extract the transition scores that reflect the models’ production rather than comprehension effort. This allows us to investigate the alignment of LM language production and human language comprehension. Our findings reveal that there are differences in the strength of the alignment between reading behavior and certain LM decoding strategies and that this alignment further reflects different stages of language understanding (early, late, or global processes). Although we find lower PP of transition-based measures compared to surprisal and entropy for most decoding strategies, our results provide valuable insights into which decoding strategies impose less processing effort for readers. Our code is available via https://github.com/DiLi-Lab/LM-human-alignment.
Anthology ID:
2024.blackboxnlp-1.14
Volume:
Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP
Month:
November
Year:
2024
Address:
Miami, Florida, US
Editors:
Yonatan Belinkov, Najoung Kim, Jaap Jumelet, Hosein Mohebbi, Aaron Mueller, Hanjie Chen
Venue:
BlackboxNLP
Publisher:
Association for Computational Linguistics
Pages:
217–231
URL:
https://aclanthology.org/2024.blackboxnlp-1.14
Cite (ACL):
Lena Sophia Bolliger, Patrick Haller, and Lena Ann Jäger. 2024. On the alignment of LM language generation and human language comprehension. In Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 217–231, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal):
On the alignment of LM language generation and human language comprehension (Bolliger et al., BlackboxNLP 2024)
PDF:
https://aclanthology.org/2024.blackboxnlp-1.14.pdf