Maimouna Ouattara


2026

This work focuses on neural machine translation between French and Mooré, leveraging the capabilities of Large Language Models (LLMs) in a low-resource language context. Mooré is a local language widely spoken in Burkina Faso but remains underrepresented in digital resources. Alongside Mooré, French, now a working language, remains widely used in administration, education, justice, etc. The coexistence of these two languages creates a growing demand for effective translation tools. However, Mooré, like many low-resource languages, poses significant challenges for machine translation due to the scarcity of parallel corpora and its complex morphology.The main objective of this work is to adapt LLMs for French–Mooré translation. Three pre-trained models were selected: No Language Left Behind (NLLB-200), mBART50, and AfroLM. A corpus of approximately 83,000 validated sentence pairs was compiled from an initial collection of 97,060 pairs through pre-processing, semantic filtering, and human evaluation. Specific adaptations to tokenizers and model architectures were applied to improve translation quality.The results show that the fine-tuned NLLB model outperforms the others, highlighting the importance of native language support. mBART50 achieves comparable performance after fine-tuning, while AfroLM remains less effective. Despite existing limitations, this study demonstrates the potential of fine-tuned LLMs for African low-resource languages.
Most of African low-resource languages are primarily spoken rather than written and lack large, standardized textual resources. In many communities, low literacy rates and limited access to formal education mean that text-based translation technologies alone are insufficient for effective communication. As a result, speech-to-speech translation systems play a crucial role by enabling direct and natural interaction across languages without requiring reading or writing skills. Such systems are essential for improving access to information, public services, healthcare, and education. The goal of our work is to build powerful transcription and speech synthesis models for Mooré language. Then, these models have been used to build a cascaded voice translation system between French and Mooré, since we already got a French-Mooré machine translation model. We collected Mooré audio-text pairs, reaching a total audio duration of 150 hours. Then, We fine-tuned Orpheus-3B and XTTS-v2 for speech synthesis and Wav2Vec-Bert-2.0 for transcription task. After fine-tuning and evaluation by 36 Mooré native speakers, XTTS-v2 achieved a MOS of 4.36 out of 5 compared to 3.47 out of 5 for Orpheus-3B. The UTMOS evaluation resulted in 3.47 out of 5 for XTTS-v2 and 2.80 out of 5 for Orpheus-3B. The A/B tests revealed that the evaluators preferred XTTS-v2 Mooré audios in 77.8% of cases compared to 22.2% for Orpheus-3B. After fine-tuning on Mooré, Wav2Vec-Bert-2.0 achieved a WER of 4.24% and a CER of 1.11%. Using these models, we successfully implemented a French-Mooré Speech-to-Speech Translation system.

2025

Position paper: In many African countries, the informal business sector represents the backbone of the economy, providing essential livelihoods and opportunities where formal employment is limited. Despite, however, the growing adoption of digital tools, entrepreneurs in this sector often face significant challenges due to lack of literacy and language barriers. These barriers not only limit accessibility but also increase the risk of fraud and financial insecurity. This position paper explores the potential of conversational agents (CAs) adapted to low-resource languages (LRLs), focusing specifically on Mooré, a language widely spoken in Burkina Faso. By enabling natural language interactions in local languages, AI-driven conversational agents offer a promising solution to enable informal traders to manage their financial transactions independently, thus promoting greater autonomy and security in business, while providing a step towards formalization of their business. Our study examines the main challenges in developing AI for African languages, including data scarcity and linguistic diversity, and reviews viable strategies for addressing them, such as cross-lingual transfer learning and data augmentation techniques.