Sabine Ploux

2025

Semantic Analysis Experiments for French Citizens’ Contribution : Combinations of Language Models and Community Detection Algorithms
Sami Guembour | Catherine Dominguès | Sabine Ploux
Proceedings of the 16th International Conference on Computational Semantics

Following the Yellow Vest crisis that occurred in France in 2018, the French government launched the Grand Débat National, which gathered citizens’ contributions. This paper presents a semantic analysis of these contributions by segmenting them into sentences and identifying the topics addressed using clustering techniques. The study tests several combinations of French language models and community detection algorithms, aiming to identify the most effective pairing for grouping sentences based on thematic similarity. Performance is evaluated using the number of clusters generated and standard clustering metrics. Principal Component Analysis (PCA) is employed to assess the impact of dimensionality reduction on sentence embeddings and clustering quality. Cluster merging methods are also developed to reduce redundancy and improve the relevance of the identified topics. Finally, the results help refine semantic analysis and shed light on the main concerns expressed by citizens.
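As a rough illustration of grouping sentences by thematic similarity, the sketch below links sentence vectors whose cosine similarity exceeds a threshold and then reads off the connected components of the resulting graph. Everything in it is an assumption made for illustration: the abstract does not name the language models or the community detection algorithms used, so the vectors are hand-made stand-ins for sentence embeddings, the sentences and the threshold are invented, the PCA step is omitted, and connected components serve only as the simplest possible substitute for a real community detection algorithm.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def similarity_graph(vectors, threshold):
    """Link every pair of vectors whose cosine similarity reaches the threshold."""
    graph = {i: set() for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def connected_components(graph):
    """Group nodes that are reachable from one another (depth-first search)."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(sorted(group))
    return groups


# Invented sentences and hand-made 3-dimensional vectors (NOT real embeddings).
sentences = [
    "Lower the tax on fuel.",
    "Reduce taxes on petrol.",
    "Open more public hospitals.",
    "Improve access to healthcare.",
]
vectors = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.8, 0.2],
]

# The threshold of 0.8 is an arbitrary choice for this toy data.
groups = connected_components(similarity_graph(vectors, threshold=0.8))
for group in groups:
    print([sentences[i] for i in group])
```

With these toy vectors the two tax sentences end up in one group and the two health sentences in another. A real pipeline would replace the hand-made vectors with model embeddings and the component search with a proper community detection algorithm.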

2011

Using Topic Salience and Connotational Drifts to Detect Candidates to Semantic Change
Armelle Boussidan | Sabine Ploux
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)

2010

The Semantic Atlas: an Interactive Model of Lexical Representation
Sabine Ploux | Armelle Boussidan | Hyungsuk Ji
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper we describe two geometrical models of meaning representation, the Semantic Atlas (SA) and the Automatic Contexonym Organizing Model (ACOM). The SA provides maps of meaning generated through correspondence factor analysis. The models can handle different types of word relations: synonymy in the SA and co-occurrence in ACOM. Their originality relies on an artifact called 'cliques' - a fine-grained infra-linguistic sub-unit of meaning. The SA is composed of several dictionaries and thesauri enhanced with a process of symmetrisation. It is currently available for French and English in monolingual versions as well as in a bilingual translation version. Other languages are under development and testing. ACOM deals with unannotated corpora. The models are used by research teams worldwide that investigate synonymy, translation processes, genre comparison, psycholinguistics and polysemy modeling. Both models can be consulted online via a flexible interface allowing for interactive navigation on http://dico.isc.cnrs.fr. This site is the most consulted address of the French National Center for Scientific Research's domain (CNRS), one of the major research bodies in France. The international interest it has triggered led us to initiate the process of going open source. In the meantime, all our databases are freely available on request.
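To make the notion of a clique concrete, here is a minimal sketch that assumes a clique can be read in the graph-theoretic sense: a maximal set of words in which every pair is linked by the relation. The synonym pairs are invented for illustration and are not taken from the Semantic Atlas data; `symmetrise` and `maximal_cliques` are hypothetical helper names, and the basic Bron-Kerbosch recursion is one standard way to enumerate maximal cliques, not necessarily the procedure the authors use.

```python
def symmetrise(pairs):
    """Build an undirected graph: if a lists b as a synonym, b also lists a."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # No word can extend the clique and none was set aside: it is maximal.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for word in sorted(candidates):
            neighbours = graph[word]
            expand(current | {word}, candidates & neighbours, excluded & neighbours)
            # Once explored, move the word from the candidates to the excluded set.
            candidates = candidates - {word}
            excluded = excluded | {word}

    expand(set(), set(graph), set())
    return cliques


# Invented synonym links around the word "fair" (illustrative only).
pairs = [
    ("fair", "just"),
    ("fair", "equitable"),
    ("just", "equitable"),
    ("fair", "blond"),
    ("fair", "light"),
    ("blond", "light"),
]

cliques = maximal_cliques(symmetrise(pairs))
for clique in cliques:
    print(sorted(clique))
```

On this toy graph the word "fair" belongs to two maximal cliques, one shared with "just" and "equitable" and one shared with "blond" and "light", which illustrates how cliques can separate senses at a finer grain than the word itself.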

2003

A Model for Matching Semantic Maps between Languages (French/English, English/French)
Sabine Ploux | Hyungsuk Ji
Computational Linguistics, Volume 29, Number 2, June 2003

Lexical knowledge representation with contextonyms
Hyungsuk Ji | Sabine Ploux | Eric Wehrli
Proceedings of Machine Translation Summit IX: Papers

Inter-word associations like stagger - drunken, or intra-word sense divisions (e.g. write a diary vs. write an article) are difficult to compile using a traditional lexicographic approach. As an alternative, we present a model that reflects this kind of subtle lexical knowledge. Based on the minimal sense of a word (clique), the model (1) selects contextually related words (contexonyms) and (2) classifies them in a multi-dimensional semantic space. Trained on very large corpora, the model provides relevant, organized contexonyms that reflect the fine-grained connotations and contextual usage of the target word, as well as the distinct senses of homonyms and polysemous words. Further study on the neighbor effect showed that the model can handle the data sparseness problem.
