Predicting Total Reading Time Using Romanian Eye-Tracking Data

Anamaria Hodivoianu, Oleksandra Kuvshynova, Filip Popovici, Adrian Luca, Sergiu Nisioi


Abstract
This work introduces the first Romanian eye-tracking dataset for reading and investigates methods for predicting word-level total reading time (TRT). We develop and compare a range of models, from traditional machine learning with handcrafted linguistic features to fine-tuned Romanian BERT architectures, demonstrating strong correlations between predicted and observed reading times. Additionally, we propose a lexical simplification pipeline that uses these TRT predictions to identify and substitute complex words, improving text readability. Our approach is integrated into an interactive web tool, illustrating the practical benefits of combining cognitive signals with NLP techniques for Romanian, a language with limited resources in this area.
Anthology ID:
2025.gaze4nlp-1.9
Volume:
Proceedings of the First International Workshop on Gaze Data and Natural Language Processing
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Cengiz Acarturk, Jamal Nasir, Burcu Can, Cagrı Coltekin
Venues:
Gaze4NLP | WS
Publisher:
INCOMA Ltd., Shoumen, BULGARIA
Pages:
71–75
URL:
https://aclanthology.org/2025.gaze4nlp-1.9/
Cite (ACL):
Anamaria Hodivoianu, Oleksandra Kuvshynova, Filip Popovici, Adrian Luca, and Sergiu Nisioi. 2025. Predicting Total Reading Time Using Romanian Eye-Tracking Data. In Proceedings of the First International Workshop on Gaze Data and Natural Language Processing, pages 71–75, Varna, Bulgaria. INCOMA Ltd., Shoumen, BULGARIA.
Cite (Informal):
Predicting Total Reading Time Using Romanian Eye-Tracking Data (Hodivoianu et al., Gaze4NLP 2025)
PDF:
https://aclanthology.org/2025.gaze4nlp-1.9.pdf