Jamal Abdul Nasir
2025
Where Patients Slow Down: Surprisal, Uncertainty, and Simplification in French Clinical Reading
Oksana Ivchenko
|
Alamgir Munir Qazi
|
Jamal Abdul Nasir
Proceedings of the First International Workshop on Gaze Data and Natural Language Processing
This eye-tracking study links language-model surprisal and contextual entropy to how 23 non-expert adults read French health texts. Participants read seven texts (clinical case, medical, general), each available in an Original and Simplified version. Surprisal and entropy were computed with eight autoregressive models (82M–8B parameters), and four complementary eye-tracking measures were analyzed. Surprisal correlates positively with early reading measures, peaking in the smallest GPT-2 models (r ≈ 0.26) and weakening with model size. Entropy shows the opposite pattern, with negative correlations strongest in the 7B-8B models (r ≈ −0.13), consistent with a skim-when-uncertain strategy. Surprisal effects are largest in Clinical Original passages and drop by ∼20% after simplification, whereas entropy effects are stable across domain and version. These findings expose a scaling paradox – where different model sizes are optimal for different cognitive signals – and suggest that French plain-language editing should focus on rewriting high-surprisal passages to reduce processing difficulty, and on avoiding high-entropy contexts for critical information.