Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-Paced Reading and Language Models

Giulia Rambelli, Emmanuele Chersoni, Marco S. G. Senaldi, Philippe Blache, Alessandro Lenci


Abstract
An open question in language comprehension studies is whether non-compositional multiword expressions like idioms and compositional-but-frequent word sequences are processed differently. Are the latter constructed online, or are instead directly retrieved from the lexicon, with a degree of entrenchment depending on their frequency? In this paper, we address this question with two different methodologies. First, we set up a self-paced reading experiment comparing human reading times for idioms and both high-frequency and low-frequency compositional word sequences. Then, we ran the same experiment using the Surprisal metrics computed with Neural Language Models (NLMs). Our results provide evidence that idiomatic and high-frequency compositional expressions are processed similarly by both humans and NLMs. Additional experiments were run to test the possible factors that could affect the NLMs’ performance.
Anthology ID:
2023.mwe-1.13
Volume:
Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Archna Bhatia, Kilian Evang, Marcos Garcia, Voula Giouli, Lifeng Han, Shiva Taslimipoor
Venue:
MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
87–98
URL:
https://aclanthology.org/2023.mwe-1.13
DOI:
10.18653/v1/2023.mwe-1.13
Cite (ACL):
Giulia Rambelli, Emmanuele Chersoni, Marco S. G. Senaldi, Philippe Blache, and Alessandro Lenci. 2023. Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-Paced Reading and Language Models. In Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023), pages 87–98, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Are Frequent Phrases Directly Retrieved like Idioms? An Investigation with Self-Paced Reading and Language Models (Rambelli et al., MWE 2023)
PDF:
https://aclanthology.org/2023.mwe-1.13.pdf
Video:
https://aclanthology.org/2023.mwe-1.13.mp4