Yves Lepage

2025

Q&A-LF: A French Question-Answering Benchmark for Measuring Fine-Grained Lexical Knowledge
Alexander Petrov | Alessandra Thais Mancas | Viviane Binet | Antoine Venant | François Lareau | Yves Lepage | Philippe Langlais
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

We introduce Q&A-LF, a French question-answering benchmark designed to assess the extent to which large language models capture fine-grained lexical knowledge. We investigate the ability of ChatGPT-4o mini, Qwen2.5-14B, Llama3.0-8B, and Llama3.1-8B to answer questions based on lexical functions (LFs) from Meaning-Text Theory. Using various prompting setups with different levels of examples and context, we find that Qwen and ChatGPT generally outperform Llama models, achieving up to 70% accuracy, while Llama models reach just above 60%. We identify LFs that are particularly easy or especially challenging for the models. We further investigate whether providing sentence-level context and one-shot prompting improve performance, especially on semantically complex functions.

ALF : Un jeu de données d’analogies françaises à grain fin pour l’évaluation de la connaissance lexicale des grands modèles de langue
Alexander Petrov | Antoine Venant | François Lareau | Yves Lepage | Philippe Langlais
Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 1 : articles scientifiques originaux

The revolution brought about by large language models (LLMs) stems from the astonishing fluency of the texts they generate. This fluency raises an essential scientific question: how much lexical knowledge do LLMs actually capture in order to produce such fluent language? To answer it, we present ALF, a freely available analogy dataset with rich lexicographic information grounded in Meaning-Text Theory. It comprises 2,600 fine-grained lexical analogies, with which we evaluate the lexical ability of four standard LLMs: ChatGPT-4o mini, Llama3.0-8B, Llama3.1-8B, and Qwen2.5-14B. On average, ChatGPT and the Llama series reach an accuracy of around 55%, while Qwen is just below the 60% mark, which shows that ALF poses a considerable challenge. We further identify certain types of analogies and prompting methods that reveal performance disparities.

AnaScore: Understanding Semantic Parallelism in Proportional Analogies
Liyan Wang | Haotong Wang | Yves Lepage
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Formulaic criteria for proportional analogies, which capture relational mappings between two ratios of terms, are mainly confined to the formal level. As analogy datasets grow more complex, especially in evaluating the cognitive abilities of Large Language Models (LLMs), assessing parallelism in them becomes increasingly challenging and often requires human annotation. In this work, we propose AnaScore, an automatic metric for evaluating the strength of semantic parallelism in sentence analogies. AnaScore systematically provides formalized explanations for shared relational patterns at the level of conceptual knowledge. We apply AnaScore to annotate several existing datasets, considering different directions of the relations, and uncover artifacts in data construction. Our experiments with various LLMs demonstrate the efficacy of the AnaScore metric in capturing the inherent quality of analogical relationships, showing a positive correlation between analogy quality and model performance. Thanks to this metric, we clearly demonstrate that formally explainable examples are more beneficial for analogical reasoning, while ambiguous analogies with no clear criterion tend to hinder inference.
