2025
A French Eye-Tracking Corpus of Original and Simplified Medical, Clinical, and General Texts - FETA
Oksana Ivchenko | Natalia Grabar
Proceedings of the First International Workshop on Gaze Data and Natural Language Processing
Eye tracking offers an objective window on the real-time cognitive processing of text as it is read: longer fixations, more regressions, and wider pupil dilation reliably index linguistic difficulty. Yet few available corpora are annotated with eye-tracking features. In this paper we introduce FETA, a French Eye-TrAcking corpus. It combines three types of texts (general, medical, and clinical) in two versions (original and manually simplified). These texts are read by 46 participants, from whom we collect eye-tracking data described by dozens of eye-tracking features.
Where Patients Slow Down: Surprisal, Uncertainty, and Simplification in French Clinical Reading
Oksana Ivchenko | Alamgir Munir Qazi | Jamal Abdul Nasir
Proceedings of the First International Workshop on Gaze Data and Natural Language Processing
This eye-tracking study links language-model surprisal and contextual entropy to how 23 non-expert adults read French health texts. Participants read seven texts (clinical case, medical, general), each available in an Original and Simplified version. Surprisal and entropy were computed with eight autoregressive models (82M–8B parameters), and four complementary eye-tracking measures were analyzed. Surprisal correlates positively with early reading measures, peaking in the smallest GPT-2 models (r ≈ 0.26) and weakening with model size. Entropy shows the opposite pattern, with negative correlations strongest in the 7B-8B models (r ≈ −0.13), consistent with a skim-when-uncertain strategy. Surprisal effects are largest in Clinical Original passages and drop by ∼20% after simplification, whereas entropy effects are stable across domain and version. These findings expose a scaling paradox – where different model sizes are optimal for different cognitive signals – and suggest that French plain-language editing should focus on rewriting high-surprisal passages to reduce processing difficulty, and on avoiding high-entropy contexts for critical information.
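For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of their standard information-theoretic definitions: surprisal is the negative log probability of the word actually observed, and entropy is the expected surprisal over the whole next-token distribution. The function names and the toy distribution are assumptions made for illustration; they are not taken from the paper, which derives its probabilities from autoregressive language models.

```python
import math

def surprisal(probs, target):
    """Surprisal in bits of the observed token: -log2 p(target | context)."""
    return -math.log2(probs[target])

def entropy(probs):
    """Shannon entropy in bits of the next-token distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Hypothetical next-token distribution for some context.
# Toy numbers chosen for illustration, not results from the study.
next_token_probs = {
    "patient": 0.5,
    "médecin": 0.25,
    "traitement": 0.125,
    "scanner": 0.125,
}

print(surprisal(next_token_probs, "patient"))  # 1.0 bit: the expected word
print(surprisal(next_token_probs, "scanner"))  # 3.0 bits: a less expected word
print(entropy(next_token_probs))               # 1.75 bits of uncertainty
```

Surprisal depends on which word actually appears, whereas entropy depends only on the context before the word is seen, which is why the two can relate to reading measures in different ways.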
L’Impact de la complexité textuelle sur le comportement de lecture : une analyse oculométrique et de la surprise des textes français
Oksana Ivchenko | Natalia Grabar
Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : articles scientifiques originaux
This study examines how text complexity affects reading processes across different text types by combining eye-tracking methodology with surprisal analysis. We created a French corpus of general, clinical, and medical texts, in original and simplified versions, annotated with comprehensive eye-tracking measures from 23 participants. Linear mixed-effects modeling reveals that surprisal significantly predicts reading times for all text types, with medical texts showing heightened sensitivity to unexpected words. Importantly, simplification has differential effects depending on text type: although it does not significantly reduce reading times for clinical texts, it considerably decreases reading times for medical texts. In addition, simplification attenuates the effect of surprisal specifically in medical texts, reducing the cognitive cost associated with processing unexpected words.
2024
Study of Medical Text Reading and Comprehension through Eye-Tracking Fixations
Oksana Ivchenko | Natalia Grabar
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024
Reading plays a crucial role in cognitive processes, acting as the primary way in which people access and assimilate information. However, the ability to comprehend text is significantly influenced by various factors related to readers and text types. We propose to study the ease of reading and the comprehension of texts through eye-tracking technology, which tracks gaze and records eye movements during reading. We concentrate on eye-tracking measures related to fixations (average fixation duration and number of fixations). The experiments are performed on several types of texts (clinical cases, encyclopedia articles related to the medical area, general-language texts, and simplified clinical cases). Eye-tracking measures are analysed quantitatively and qualitatively to identify reading patterns and to analyse how reading differs across the text types.
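The two fixation measures named in this abstract, number of fixations and average fixation duration, can be illustrated with a minimal sketch. The record layout, the function name `fixation_measures`, and the toy durations are assumptions made for illustration; they do not reflect the paper's data or processing pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fixation records: (text_type, fixation_duration_ms).
# Toy values for illustration only, not data from the study.
fixations = [
    ("clinical", 240), ("clinical", 310), ("clinical", 290),
    ("general", 200), ("general", 220),
]

def fixation_measures(records):
    """Group fixations by text type and compute count and mean duration."""
    by_type = defaultdict(list)
    for text_type, duration_ms in records:
        by_type[text_type].append(duration_ms)
    return {
        text_type: {
            "n_fixations": len(durations),
            "mean_duration_ms": mean(durations),
        }
        for text_type, durations in by_type.items()
    }

measures = fixation_measures(fixations)
for text_type, values in measures.items():
    print(text_type, values)
```

In a real analysis the records would also carry participant and word identifiers, so that measures can be aggregated per reader or per word before text types are compared.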