Sébastien Fournier

Also published as: Sebastien Fournier

2025

Représentations conditionnelles entité-centrées pour le raisonnement multi-saut dans les systèmes de question-réponse multi-document
Romain Bourgeois | Adrian Chifu | Sébastien Fournier
Actes de l'atelier Accès à l’information basé sur le dialogue et grands modèles de langage 2025 (DIAG-LLM)

Multi-document question answering (MD-QA) systems require multi-hop reasoning over information scattered across several documents. To structure this information, many approaches rely on knowledge graphs in which text passages are represented as nodes linked by lexical, semantic, or symbolic relations. In this context, this paper proposes EntEmbed, an encoder designed to represent a passage conditionally on a specific entity it contains. This entity-centric representation aims to capture the semantic dimensions associated with the entity while preserving a fine-grained contextualization of the passage. The goal is to explore how such representations can be built and to use them to improve multi-hop reasoning in MD-QA systems.

Stimuler la Pensée Étudiante avec l’AQG : Vers une Génération Automatique de Questions de Type Étudiant
Abdelbassat Labeche | Sébastien Fournier
Actes de l'atelier Intelligence Artificielle générative et ÉDUcation : Enjeux, Défis et Perspectives de Recherche 2025 (IA-ÉDU)

Automatic question generation (AQG) systems are widely used in educational settings to assess knowledge. These systems focus almost exclusively on teacher-style questions, which are structured and factual. This article proposes a novel approach, Student-AQG, which aims to simulate the spontaneous questions a real student might ask, reflecting their misunderstandings, curiosity, or need for deeper exploration. Building on recent work in autonomous question generation, we design a modular system based on LLMs guided by prompt engineering that takes the learner's cognitive profile into account. We describe an evaluation strategy combining automatic metrics and human annotations of fluency, relevance, and pedagogical value. This work aims to help students formulate questions and thereby develop their critical thinking, an essential skill that is often neglected because of the low level of spontaneous questioning observed in the classroom.

Vers des RAGs intégrant véracité, subjectivité et explicabilité
Alae Bouchiba | Adrian-Gabriel Chifu | Sébastien Fournier | Lorraine Goeuriot | Philippe Mulhem
Actes de l'atelier Intelligence Artificielle générative et ÉDUcation : Enjeux, Défis et Perspectives de Recherche 2025 (IA-ÉDU)

This article introduces X-RAG-VS, a framework for integrating veracity, subjectivity, and explainability into RAG systems, in response to educational needs. Through use cases and an analysis of existing models, we show that these dimensions remain insufficiently addressed. We propose a unified approach for more reliable, nuanced, and explainable answers.

2022

DeepREF: A Framework for Optimized Deep Learning-based Relation Classification
Igor Nascimento | Rinaldo Lima | Adrian-Gabriel Chifu | Bernard Espinasse | Sébastien Fournier
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Relation Extraction (RE) is an important basic Natural Language Processing (NLP) task for many applications, such as search engines, recommender systems, and question-answering systems. Many studies in this subarea of NLP continue to be explored, for example through the SemEval campaigns (2010 to 2018) or DDI Extraction (2013). For more than ten years, different RE systems, mainly using statistical models, have been proposed, along with frameworks for developing them. This paper focuses on frameworks for developing such RE systems with deep learning models. Such frameworks should make it possible to reproduce experiments with the various deep learning models and pre-processing techniques proposed in the literature. Currently, there are very few frameworks of this type, and we propose a new open and optimizable framework, called DeepREF, inspired by the existing OpenNRE and REflex frameworks. DeepREF makes it possible to employ various deep learning models, optimize their use, identify the best inputs, obtain better results on each RE data set, and compare with other experiments, making ablation studies possible. The DeepREF framework is evaluated on several reference corpora from various application domains.

2020

Natural Language Processing (NLP) of textual data is usually broken down into a sequence of several subtasks, where the output of one of the subtasks becomes the input to the following one, which constitutes an NLP pipeline. Many third-party NLP tools are currently available, each performing distinct NLP subtasks. However, it is difficult to integrate several NLP toolkits into a pipeline due to many problems, including different input/output representations or formats, distinct programming languages, and tokenization issues. This paper presents DeepNLPF, a framework that enables easy integration of third-party NLP tools, allowing the user to preprocess natural language texts at lexical, syntactic, and semantic levels. The proposed framework also provides an API for complete pipeline customization, including the definition of input/output formats, integration plugin management, transparent multiprocessing execution strategies, corpus-level statistics, and database persistence. Furthermore, the user-friendly DeepNLPF GUI allows its use even by a non-expert NLP user. We conducted a runtime performance analysis showing that DeepNLPF not only easily integrates existing NLP toolkits but also significantly reduces runtime compared to executing the same NLP pipeline sequentially.

2017

LSIS at SemEval-2017 Task 4: Using Adapted Sentiment Similarity Seed Words For English and Arabic Tweet Polarity Classification
Amal Htait | Sébastien Fournier | Patrice Bellot
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We present our contribution to SemEval-2017 Task 4, “Sentiment Analysis in Twitter”, subtask A, “Message Polarity Classification”, for English and Arabic. Our system is based on a list of sentiment seed words adapted for tweets. The sentiment relations between seed words and other terms are captured by the cosine similarity between their word embedding representations (word2vec). These seed words are extracted from datasets of annotated tweets available online. Our tests using these seed words show a significant improvement in the polarity classification of tweets compared to the use of Turney and Littman’s (2003) seed words.
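
The seed-word idea in this abstract can be illustrated with a minimal sketch. It is not the authors' system: the toy 3-dimensional vectors stand in for real word2vec embeddings, and the names `EMB`, `POS_SEEDS`, `NEG_SEEDS`, `polarity_score`, and `classify`, as well as the scoring rule (mean similarity to positive seeds minus mean similarity to negative seeds), are assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings; a real system would load word2vec vectors.
EMB = {
    "good": [0.9, 0.1, 0.0],
    "love": [0.8, 0.2, 0.1],
    "bad": [-0.9, 0.1, 0.0],
    "hate": [-0.8, 0.2, 0.1],
    "awesome": [0.7, 0.3, 0.0],
    "awful": [-0.7, 0.3, 0.0],
}

# Hypothetical seed lists; the paper extracts its seeds from annotated tweets.
POS_SEEDS = [EMB["good"], EMB["love"]]
NEG_SEEDS = [EMB["bad"], EMB["hate"]]

def polarity_score(vec, pos_seeds, neg_seeds):
    """Assumed rule: mean similarity to positive seeds minus mean to negative seeds."""
    pos = sum(cosine(vec, s) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(vec, s) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

def classify(tokens):
    """Sum the scores of known tokens and map the sign to a polarity label."""
    scores = [polarity_score(EMB[t], POS_SEEDS, NEG_SEEDS) for t in tokens if t in EMB]
    total = sum(scores)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

if __name__ == "__main__":
    print(classify(["awesome"]))
    print(classify(["awful"]))
```

Tokens without an embedding are skipped, so a tweet made only of unknown words falls back to "neutral" under this assumed rule.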

2016

Bilbo-Val: Automatic Identification of Bibliographical Zone in Papers
Amal Htait | Sebastien Fournier | Patrice Bellot
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we present the automatic annotation of the bibliographical reference zone in papers and articles in XML/TEI format. Our work proceeds in two phases. First, we use machine learning to classify paragraphs as bibliographical or non-bibliographical, by means of a model that was initially created to distinguish footnotes that contain bibliographical references from those that do not. This model is one of the features of BILBO, an open-source software for the automatic annotation of bibliographic references. We also suggest some methods to minimize the margin of error. Second, we propose an algorithm to find the largest list of bibliographical references in the article. The improvements applied to our model increase its efficiency, with an accuracy of 85.89. In testing, we achieve an average success rate of 72.23% in detecting the bibliographical reference zone.

LSIS at SemEval-2016 Task 7: Using Web Search Engines for English and Arabic Unsupervised Sentiment Intensity Prediction
Amal Htait | Sebastien Fournier | Patrice Bellot
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)