Lavinia Salicchi

2025

Not Every Metric is Equal: Cognitive Models for Predicting N400 and P600 Components During Reading Comprehension
Lavinia Salicchi | Yu-Yin Hsu
Proceedings of the 31st International Conference on Computational Linguistics

In recent years, numerous studies have sought to understand the cognitive dynamics underlying language processing by modeling reading times and ERP amplitudes using computational metrics like surprisal. In the present paper, we examine the predictive power of surprisal, entropy, and a novel metric based on semantic similarity for N400 and P600. Our experiments, conducted with Mandarin Chinese materials, revealed three key findings: 1) expectancy plays a primary role for N400; 2) P600 also reflects the cognitive effort required to evaluate linguistic input semantically; and 3) during the time window of interest, information uncertainty influences the language processing the most. Our findings show how computational metrics that capture distinct cognitive dimensions can effectively address psycholinguistic questions.

2024

pdf bib abs

What’s in a Name? Electrophysiological Differences in Processing Proper Nouns in Mandarin Chinese
Bernard A. J. Jap | Yu-Yin Hsu | Lavinia Salicchi | Yu Xi Li
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

The current study examines how proper names and common nouns in Chinese are cognitively processed during sentence comprehension. EEG data was recorded when participants were presented with neutral contexts followed by either a proper name or a common noun. Proper names in Chinese often consist of characters that can function independently as words or be combined with other characters to form words, potentially benefiting from the semantic features carried by each character. Using cluster-based permutation tests, we found a larger N400 for common nouns when compared to proper names. Our results suggest that the semantics of characters do play a role in facilitating the processing of proper names. This is consistent with previous behavioral findings on noun processing in Chinese, indicating that common nouns require more cognitive resources to process than proper names. Moreover, our results suggest that proper names are processed differently between alphabetic languages and Chinese language.

2022

pdf bib abs

HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features
Lavinia Salicchi | Rong Xiang | Yu-Yin Hsu
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Eye movement data are used in psycholinguistic studies to infer information regarding cognitive processes during reading. In this paper, we describe our proposed method for the Shared Task of Cognitive Modeling and Computational Linguistics (CMCL) 2022 - Subtask 1, which involves data from multiple datasets on 6 languages. We compared different regression models using features of the target word and its previous word, and target word surprisal as regression features. Our final system, using a gradient boosting regressor, achieved the lowest mean absolute error (MAE), resulting in the best system of the competition.

2021

pdf bib abs

PIHKers at CMCL 2021 Shared Task: Cosine Similarity and Surprisal to Predict Human Reading Patterns.
Lavinia Salicchi | Alessandro Lenci
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Eye-tracking psycholinguistic studies have revealed that context-word semantic coherence and predictability influence language processing. In this paper we show our approach to predict eye-tracking features from the ZuCo dataset for the shared task of the Cognitive Modeling and Computational Linguistics (CMCL2021) workshop. Using both cosine similarity and surprisal within a regression model, we significantly improved the baseline Mean Absolute Error computed among five eye-tracking features.

pdf bib abs

Looking for a Role for Word Embeddings in Eye-Tracking Features Prediction: Does Semantic Similarity Help?
Lavinia Salicchi | Alessandro Lenci | Emmanuele Chersoni
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

Eye-tracking psycholinguistic studies have suggested that context-word semantic coherence and predictability influence language processing during the reading activity. In this study, we investigate the correlation between the cosine similarities computed with word embedding models (both static and contextualized) and eye-tracking data from two naturalistic reading corpora. We also studied the correlations of surprisal scores computed with three state-of-the-art language models. Our results show strong correlation for the scores computed with BERT and GloVe, suggesting that similarity can play an important role in modeling reading times.