Fréjus A. A. Laleye


2024

FFSTC: Fongbe to French Speech Translation Corpus
D. Fortuné Kponou | Fréjus A. A. Laleye | Eugène Cokou Ezin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we introduce the Fongbe to French Speech Translation Corpus (FFSTC). This corpus encompasses approximately 31 hours of collected Fongbe language content, featuring both French transcriptions and corresponding Fongbe voice recordings. FFSTC represents a comprehensive dataset compiled through various collection methods and the efforts of dedicated individuals. Furthermore, we conduct baseline experiments using Fairseq’s transformer_s and conformer models to evaluate data quality and validity. Our results indicate a BLEU score of 8.96 for the transformer_s model and 8.14 for the conformer model, establishing a baseline for the FFSTC corpus.

2022

OFU@SMM4H’22: Mining Advent Drug Events Using Pretrained Language Models
Omar Adjali | Fréjus A. A. Laleye | Umang Aggarwal
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

In this paper, we describe our proposed systems for the Social Media Mining for Health 2022 shared task 1. In particular, we participated in the three sub-tasks, which aim at extracting and processing Adverse Drug Events. We investigate different transformer-based pretrained models that we fine-tuned on each task, and we propose some improvements for the entity normalization task.

2020

A French Medical Conversations Corpus Annotated for a Virtual Patient Dialogue System
Fréjus A. A. Laleye | Gaël de Chalendar | Antonia Blanié | Antoine Brouquet | Dan Benhamou
Proceedings of the Twelfth Language Resources and Evaluation Conference

Data-driven approaches for creating virtual patient dialogue systems require the availability of large data specific to the language, domain and clinical cases studied. Based on the lack of dialogue corpora in French for medical education, we propose an annotated corpus of dialogues including medical consultation interactions between doctor and patient. In this work, we detail the building process of the proposed dialogue corpus, describe the annotation guidelines and also present the statistics of its contents. We then conducted a question categorization task to evaluate the benefits of the proposed corpus that is made publicly available.

2019

Hybridation d’un agent conversationnel avec des plongements lexicaux pour la formation au diagnostic médical (Hybridization of a conversational agent with word embeddings for medical diagnostic training)
Fréjus A. A. Laleye | Gaël de Chalendar | Antoine Brouquet | Antonia Blanié | Dan Benhamou
Actes de la Conférence sur le Traitement Automatique des Langues Naturelles (TALN) PFIA 2019. Volume II : Articles courts

In the medical context, a conversational virtual patient or doctor makes it possible to train learners in medical diagnosis autonomously through simulation. In this work, we exploit the semantic properties captured by distributed word representations to retrieve similar questions in the dialogue system of a medical conversational agent. Two dialogue systems were built and evaluated on datasets collected during tests with learners. The first system, based on matching hand-crafted dialogue rules, achieves an overall performance of 92% coherent responses on the clinical case studied, while the second system, which combines dialogue rules with semantic similarity, achieves 97% coherent responses, reducing comprehension errors by 7% compared with the rule-matching system.