Adrian Luca
2025
Predicting Total Reading Time Using Romanian Eye-Tracking Data
Anamaria Hodivoianu | Oleksandra Kuvshynova | Filip Popovici | Adrian Luca | Sergiu Nisioi
Proceedings of the First International Workshop on Gaze Data and Natural Language Processing
This work introduces the first Romanian eye-tracking dataset for reading and investigates methods for predicting word-level total reading times. We develop and compare a range of models, from traditional machine learning using handcrafted linguistic features to fine-tuned Romanian BERT architectures, demonstrating strong correlations between predicted and observed reading times. Additionally, we propose a lexical simplification pipeline that leverages these TRT predictions to identify and substitute complex words, enhancing text readability. Our approach is integrated into an interactive web tool, illustrating the practical benefits of combining cognitive signals with NLP techniques for Romanian — a language with limited resources in this area.