EIFFEL: a novel benchmark to measure bias of English heavy training on French idiomatic expressions

Charlotte Noel; Nicholas Asher; Olivier Gouvert; Farah Benamara; Julie Hunter

Abstract

Mainstream multilingual LLMs are generally trained on a much higher proportion of English than multilingual data, raising questions about their ability to capture linguistic features particular to non-English languages or to capture information important to non-anglophone cultures. We add to a growing effort to increase multilingual sensitivity in LLMs by developing a benchmark, EIFFEL, testing mastery of French idiomatic expressions in context. We fully explain the methodology, which exploits input from native French speakers, to make it reproducible for other languages. We compare mainstream multilingual LLMs with French-focused LLMs both on standard LLM benchmarks and EIFFEL; EIFFEL brings out the benefits of higher proportions of French data and shows limitations of standard benchmarks for measuring multilingual competence.

Anthology ID: 2026.acl-long.1326
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 28729–28747
URL: https://aclanthology.org/2026.acl-long.1326/
Cite (ACL): Charlotte Noel, Nicholas Asher, Olivier Gouvert, Farah Benamara, and Julie Hunter. 2026. EIFFEL: a novel benchmark to measure bias of English heavy training on French idiomatic expressions. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 28729–28747, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): EIFFEL: a novel benchmark to measure bias of English heavy training on French idiomatic expressions (Noel et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1326.pdf
Checklist: 2026.acl-long.1326.checklist.pdf
