Bastien Liétard

Also published as: Bastien Lietard


2025

Vers les Sens et Au-delà : Induire des Concepts Sémantiques Avec des Modèles de Langue Contextuels
Bastien Liétard | Pascal Denis | Mikaela Keller
Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 2 : traductions d'articles publiés

Polysemy and synonymy are two crucial, interdependent facets of lexical-semantic ambiguity, yet they are often considered independently in practical NLP problems. In this article, we introduce Concept Induction, an unsupervised task that consists in learning a soft clustering of words defining a set of concepts directly from data. This task generalizes Word Sense Induction (through a word's membership in multiple groups). We propose a bi-level approach to Concept Induction, with a lemma-centric view and a global view of the lexicon. We evaluate the resulting clustering on SemCor's annotated data and obtain good performance (BCubed F1 above 0.60). We find that the two levels are mutually beneficial for inducing concepts and senses. Finally, we create so-called "static" embeddings representing our induced concepts and obtain performance competitive with the state of the art on Word-in-Context.
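As a rough illustration of the bi-level idea described above (a sketch under simplified assumptions, not the authors' implementation), one can first cluster each lemma's occurrence embeddings into senses, then cluster the sense centroids across the lexicon into concepts; the lemmas, cluster counts, and random vectors below are placeholders for real contextualized embeddings.

```python
# Illustrative bi-level concept induction sketch (not the paper's exact method):
# local clustering of each lemma's occurrences into senses, then global
# clustering of sense centroids into cross-lexicon concepts.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
# occurrences[lemma] = (n_occurrences, dim) matrix standing in for
# contextualized embeddings (e.g. from a BERT-like model).
occurrences = {lemma: rng.normal(size=(30, 16)) for lemma in ["bank", "coast", "shore"]}

# Level 1 (local, lemma-centric): induce senses per lemma.
sense_centroids, sense_owner = [], []
for lemma, X in occurrences.items():
    labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)
    for s in np.unique(labels):
        sense_centroids.append(X[labels == s].mean(axis=0))
        sense_owner.append(lemma)

# Level 2 (global, cross-lexicon): group senses into concepts. A lemma
# joins every concept one of its senses falls into, which yields the
# soft (multi-membership) clustering of words.
concepts = AgglomerativeClustering(n_clusters=3).fit_predict(np.stack(sense_centroids))
for c in np.unique(concepts):
    members = {sense_owner[i] for i in np.where(concepts == c)[0]}
    print(f"concept {c}: {sorted(members)}")
```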

2024

To Word Senses and Beyond: Inducing Concepts with Contextualized Language Models
Bastien Liétard | Pascal Denis | Mikaela Keller
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Polysemy and synonymy are two crucial interrelated facets of lexical ambiguity. While both phenomena are widely documented in lexical resources and have been studied extensively in NLP, leading to dedicated systems, they are often considered independently in practical problems. While many tasks dealing with polysemy (e.g. Word Sense Disambiguation or Induction) highlight the role of a word's senses, the study of synonymy is rooted in the study of concepts, i.e. meanings shared across the lexicon. In this paper, we introduce Concept Induction, the unsupervised task of learning a soft clustering among words that defines a set of concepts directly from data. This task generalizes Word Sense Induction. We propose a bi-level approach to Concept Induction that leverages both a local lemma-centric view and a global cross-lexicon view to induce concepts. We evaluate the obtained clustering on SemCor's annotated data and obtain good performance (BCubed F1 above 0.60). We find that the local and the global levels are mutually beneficial to induce concepts and also senses in our setting. Finally, we create static embeddings representing our induced concepts and use them on the Word-in-Context task, obtaining competitive performance with the State-of-the-Art.
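The BCubed F1 reported above can be computed as in the minimal sketch below. This covers the standard hard-clustering variant with toy labels; the paper's soft-clustering setting would call for an extended BCubed.

```python
# Minimal BCubed precision/recall/F1 for a hard clustering. For each item,
# precision is the fraction of items in its predicted cluster that share its
# gold class, and recall is the fraction of items in its gold class that
# share its predicted cluster; both are averaged over items.
def bcubed_f1(pred, gold):
    """pred, gold: lists where pred[i]/gold[i] are item i's cluster/class."""
    n = len(pred)
    precision = recall = 0.0
    for i in range(n):
        same_cluster = [j for j in range(n) if pred[j] == pred[i]]
        same_class = [j for j in range(n) if gold[j] == gold[i]]
        correct = sum(1 for j in same_cluster if gold[j] == gold[i])
        precision += correct / len(same_cluster)
        recall += correct / len(same_class)
    p, r = precision / n, recall / n
    return 2 * p * r / (p + r)

# Two items wrongly merged: BCubed F1 drops below 1.
print(bcubed_f1([0, 0, 1, 1], [0, 0, 0, 1]))
```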

Towards an Onomasiological Study of Lexical Semantic Change Through the Induction of Concepts
Bastien Liétard | Mikaela Keller | Pascal Denis
Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change

2023

A Tale of Two Laws of Semantic Change: Predicting Synonym Changes with Distributional Semantic Models
Bastien Lietard | Mikaela Keller | Pascal Denis
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Lexical Semantic Change is the study of how the meaning of words evolves through time. Another related question is whether and how lexical relations over pairs of words, such as synonymy, change over time. There are currently two competing, apparently opposite hypotheses in the historical linguistic literature regarding how synonymous words evolve: the Law of Differentiation (LD) argues that synonyms tend to take on different meanings over time, whereas the Law of Parallel Change (LPC) claims that synonyms tend to undergo the same semantic change and therefore remain synonyms. So far, there has been little research using distributional models to assess to what extent these laws apply in historical corpora. In this work, we take a first step toward detecting whether LD or LPC operates for given word pairs. After recasting the problem into a more tractable task, we combine two linguistic resources to propose the first complete evaluation framework on this problem and provide empirical evidence in favor of a dominance of LD. We then propose various computational approaches to the problem using Distributional Semantic Models and grounded in recent literature on Lexical Semantic Change detection. Our best approaches achieve a balanced accuracy above 0.6 on our dataset. We discuss challenges still faced by these approaches, such as polysemy or the potential confusion between synonymy and hypernymy.
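One simple way to operationalize the LD/LPC distinction with distributional representations (an illustrative sketch, not necessarily one of the paper's approaches) is to track a synonym pair's cosine similarity across two time periods in a shared, aligned embedding space: a clear drop suggests differentiation, staying close suggests parallel change. The margin threshold and random vectors below are placeholders.

```python
# Sketch: classify a synonym pair as Law of Differentiation (LD) or Law of
# Parallel Change (LPC) from diachronic embeddings in one aligned space.
import numpy as np

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def ld_or_lpc(w1_t1, w2_t1, w1_t2, w2_t2, margin=0.05):
    """Vectors for words w1, w2 at periods t1 and t2 (same aligned space)."""
    drift = cos(w1_t1, w2_t1) - cos(w1_t2, w2_t2)
    return "LD" if drift > margin else "LPC"

rng = np.random.default_rng(1)
base = rng.normal(size=50)
noise = rng.normal(size=50)
# A pair that starts close and drifts apart is classified as LD.
print(ld_or_lpc(base, base + 0.1 * noise, base, base + 2.0 * noise))
```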

2021

Do Language Models Know the Way to Rome?
Bastien Liétard | Mostafa Abdou | Anders Søgaard
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

The global geometry of language models is important for a range of applications, but language model probes tend to evaluate rather local relations, for which ground truths are easily obtained. In this paper we exploit the fact that in geography, ground truths are available beyond local relations. In a series of experiments, we evaluate the extent to which language model representations of city and country names are isomorphic to real-world geography, e.g., if you tell a language model where Paris and Berlin are, does it know the way to Rome? We find that language models generally encode limited geographic information, but with larger models performing the best, suggesting that geographic knowledge can be induced from higher-order co-occurrence statistics.
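A common way to probe this kind of global geometry (a schematic sketch of the general setup, not the paper's exact protocol) is to fit a linear map from name representations to coordinates on some cities and test it on held-out ones; the coordinates are real, but the embeddings below are random placeholders for language model representations.

```python
# Sketch of a geographic probe: learn embeddings -> (latitude, longitude)
# on training cities, then check where a held-out city lands.
import numpy as np
from sklearn.linear_model import Ridge

coords = {  # (latitude, longitude)
    "Paris": (48.9, 2.4), "Berlin": (52.5, 13.4), "Madrid": (40.4, -3.7),
    "Warsaw": (52.2, 21.0), "Rome": (41.9, 12.5),
}
rng = np.random.default_rng(2)
emb = {city: rng.normal(size=32) for city in coords}  # placeholder embeddings

train = ["Paris", "Berlin", "Madrid", "Warsaw"]
probe = Ridge(alpha=1.0).fit(
    np.stack([emb[c] for c in train]),
    np.array([coords[c] for c in train]),
)
# With real representations, the question is how close this lands to Rome.
print("predicted Rome:", probe.predict(emb["Rome"][None, :])[0])
```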