Ismaël Rousseau


2023

Exploring Social Sciences Archives with Explainable Document Linkage through Question Generation
Elie Antoine | Hyun Jung Kang | Ismaël Rousseau | Ghislaine Azémard | Frederic Bechet | Geraldine Damnati
Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

This paper proposes a new approach for exploring digitized humanities and social sciences collections based on explainable links built from questions. Our experiments show the quality of our automatically generated questions and their relevance in a local context, as well as the originality of the links produced by embeddings based on these questions. We also analyze the types of questions generated on our corpus and the related uses that can enrich exploration. Finally, we discuss the relationships between co-references, the generated questions, and the answers extracted from the text, which open a path for future improvements to co-reference resolution in our system.

Questionner pour expliquer: construction de liens explicites entre documents par la génération automatique de questions
Elie Antoine | Hyun Jung Kang | Ismaël Rousseau | Ghislaine Azémard | Frédéric Béchet | Géraldine Damnati
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 4 : articles déjà soumis ou acceptés en conférence internationale

This article presents a document exploration method based on creating a synthetic set of questions and answers about the corpus, which is then used to establish explainable links between documents. We conduct a quantitative and qualitative evaluation of the automatically generated questions in terms of their form and their relevance for exploring the collection. In addition, we present a quantitative study of the links obtained with our method on a collection of documents from digitized archives.

Darbarer @ AutoMin2023: Transcription simplification for concise minute generation from multi-party conversations
Ismaël Rousseau | Loïc Fosse | Youness Dkhissi | Geraldine Damnati | Gwénolé Lecorvé
Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges

This document reports the approach of our team Darbarer for the main task (Task A) of the AutoMin 2023 challenge. Our system is composed of four main modules. The first module relies on a text simplification model aiming at standardizing the utterances of the conversation and compressing the input in order to focus on informative content. The second module handles summarization by employing a straightforward segmentation strategy and a fine-tuned BART-based generative model. Then a titling module has been trained in order to propose a short description of each summarized block. Lastly, we apply a post-processing step aimed at enhancing readability through specific formatting rules. Our contributions lie in the first, third and last steps. Our system generates precise and concise minutes. We provide a detailed description of our modules, discuss the difficulty of evaluating their impact and propose an analysis of observed errors in our generated minutes.