Evaluating Large Language Models on Multiword Expressions in Multilingual and Code-Switched Contexts

Frances Adriana Laureano De Leon; Asim Abbas; Harish Tayyar Madabushi; Mark Lee

Abstract

Multiword expressions, characterised by non-compositional meanings and syntactic irregularities, are an example of nuanced language. These expressions can be used literally or idiomatically, leading to significant changes in meaning. Although large language models perform well on many tasks, their ability to handle subtle linguistic phenomena remains unclear. This study examines how state-of-the-art models process the ambiguity of potentially idiomatic multiword expressions, particularly in less frequent contexts where memorisation is less likely to help. By evaluating models in Portuguese, Galician, and English, and introducing a new code-switched dataset and task, we show that large language models, despite their strengths, have difficulty handling nuanced language. In particular, we find that the latest models, including GPT-4, fail to outperform the xlm-roBERTa-base baselines in both detection and semantic tasks, with especially poor performance on the novel task we introduce, despite its similarity to existing tasks. Overall, our results demonstrate that multiword expressions, especially those that are ambiguous, continue to be a challenge for models. We provide open access to our datasets, prompts and model responses.

Anthology ID: 2025.ranlp-1.75
Volume: Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month: September
Year: 2025
Address: Varna, Bulgaria
Editors: Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue: RANLP
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 644–653
URL: https://aclanthology.org/2025.ranlp-1.75/
Cite (ACL): Frances Adriana Laureano De Leon, Asim Abbas, Harish Tayyar Madabushi, and Mark Lee. 2025. Evaluating Large Language Models on Multiword Expressions in Multilingual and Code-Switched Contexts. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 644–653, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Evaluating Large Language Models on Multiword Expressions in Multilingual and Code-Switched Contexts (Laureano De Leon et al., RANLP 2025)
PDF: https://aclanthology.org/2025.ranlp-1.75.pdf
