Leran Zhang


2024

pdf bib
Eye-Tracking Features Masking Transformer Attention in Question-Answering Tasks
Leran Zhang | Nora Hollenstein
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Eye movement features are considered to be direct signals reflecting human attention distribution with a low cost to obtain, inspiring researchers to augment language models with eye-tracking (ET) data. In this study, we select first fixation duration (FFD) and total reading time (TRT) as the cognitive signals to guide Transformer attention in question-answering (QA) tasks. We design three different ET attention masks based on the two features, either collected from human reading events or generated by a gaze-predicting model. We augment BERT and ALBERT models with attention masks structured based on the ET data. We find that augmenting a model with ET data carries linguistic features complementing the information captured by the model. It improves the models’ performance but compromises the stability. Different Transformer models benefit from different types of ET attention masks, while ALBERT performs better than BERT. Moreover, ET data collected from real-life reading events has better model augmenting ability than the model-predicted data.