Idiom Understanding as a Tool to Measure the Dialect Gap

David Beauchemin; Yan Tremblay; Mohamed Amine Youssef; Richard Khoury

Abstract

Idiom understanding and dialect understanding are both well-established benchmark tasks in natural language processing. In this paper, we propose combining them by using regional idioms as a test of dialect understanding. To this end, we introduce three new benchmark datasets: two for the Quebec dialect of French, QFrCoRE, which contains 4,633 instances of idiomatic phrases, and QFrCoRT, which comprises 171 regional instances of idiomatic words; and one for Metropolitan French expressions, MFrCoE, which comprises 4,938 phrases. We explain how these corpora were constructed so that our methodology can be replicated for other dialects. Our experiments with 111 LLMs reveal a critical disparity in dialectal competence: while models perform well on Metropolitan French, 65.77% of them perform significantly worse on Quebec idioms, and only 9.0% favor the regional dialect. These results confirm that our benchmarks are a reliable tool for quantifying the dialect gap and that proficiency in the prestige variety does not guarantee understanding of a regional dialect.

Anthology ID: 2026.findings-acl.24
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 505–522
URL: https://aclanthology.org/2026.findings-acl.24/
Cite (ACL): David Beauchemin, Yan Tremblay, Mohamed Amine Youssef, and Richard Khoury. 2026. Idiom Understanding as a Tool to Measure the Dialect Gap. In Findings of the Association for Computational Linguistics: ACL 2026, pages 505–522, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Idiom Understanding as a Tool to Measure the Dialect Gap (Beauchemin et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.24.pdf
Checklist: 2026.findings-acl.24.checklist.pdf
