Modelling Expectation-based and Memory-based Predictors of Human Reading Times with Syntax-guided Attention

Lukas Mielczarek, Timothée Bernard, Laura Kallmeyer, Katharina Spalek, Benoit Crabbé


Abstract
The correlation between reading times and surprisal is well known in psycholinguistics and is easy to observe. There is also a correlation between reading times and structural integration, which is, however, harder to detect (Gibson, 2000). This correlation has been studied using parsing models whose outputs are linked to reading times. In this paper, we study the relevance of memory-based effects in reading times and how to predict them using neural language models. We find that integration costs significantly improve surprisal-based reading time prediction. Inspired by Timkey and Linzen (2023), we design a small-scale autoregressive transformer language model in which attention heads are supervised by dependency relations. We compare this model to a standard variant by checking how well each model’s outputs correlate with human reading times and find that predicted attention scores can be effectively used as proxies for syntactic integration costs to predict self-paced reading times.
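As a minimal illustration of the expectation-based predictor named in the abstract: surprisal is conventionally defined as the negative log probability of a word given its preceding context. The sketch below is not the authors' code. The function name `surprisal_bits` and the probability values are hypothetical, chosen only to show the computation; in practice the probabilities would come from a language model.

```python
import math


def surprisal_bits(prob: float) -> float:
    """Return the surprisal, in bits, of a word whose conditional
    probability given its context is `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


# Hypothetical next-word probabilities (assumed values, not from the paper).
probs = [0.5, 0.25, 0.125]

# Less probable words receive higher surprisal.
surprisals = [surprisal_bits(p) for p in probs]
```

Per-word values of this kind are what a surprisal-based analysis would relate to reading times, typically in a regression alongside other predictors.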
Anthology ID:
2025.brigap-1.7
Volume:
Proceedings of the Second Workshop on the Bridges and Gaps between Formal and Computational Linguistics (BriGap-2)
Month:
September
Year:
2025
Address:
Düsseldorf, Germany
Editors:
Timothée Bernard, Timothee Mickus
Venues:
BriGap | WS
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
52–71
URL:
https://aclanthology.org/2025.brigap-1.7/
Cite (ACL):
Lukas Mielczarek, Timothée Bernard, Laura Kallmeyer, Katharina Spalek, and Benoit Crabbé. 2025. Modelling Expectation-based and Memory-based Predictors of Human Reading Times with Syntax-guided Attention. In Proceedings of the Second Workshop on the Bridges and Gaps between Formal and Computational Linguistics (BriGap-2), pages 52–71, Düsseldorf, Germany. Association for Computational Linguistics.
Cite (Informal):
Modelling Expectation-based and Memory-based Predictors of Human Reading Times with Syntax-guided Attention (Mielczarek et al., BriGap 2025)
PDF:
https://aclanthology.org/2025.brigap-1.7.pdf