Walker Stanislas Rocksane COMPAORE


2026

This work focuses on neural machine translation between French and Mooré, leveraging the capabilities of Large Language Models (LLMs) in a low-resource language context. Mooré is a local language widely spoken in Burkina Faso but remains underrepresented in digital resources. Alongside Mooré, French, the country's working language, is widely used in administration, education, justice, and other domains. The coexistence of these two languages creates a growing demand for effective translation tools. However, Mooré, like many low-resource languages, poses significant challenges for machine translation due to the scarcity of parallel corpora and its complex morphology.

The main objective of this work is to adapt LLMs for French–Mooré translation. Three pre-trained models were selected: No Language Left Behind (NLLB-200), mBART50, and AfroLM. A corpus of approximately 83,000 validated sentence pairs was compiled from an initial collection of 97,060 pairs through pre-processing, semantic filtering, and human evaluation. Specific adaptations to the tokenizers and model architectures were applied to improve translation quality.

The results show that the fine-tuned NLLB-200 model outperforms the others, highlighting the importance of native support for the target language. mBART50 achieves comparable performance after fine-tuning, while AfroLM remains less effective. Despite existing limitations, this study demonstrates the potential of fine-tuned LLMs for African low-resource languages.
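The semantic filtering step mentioned above can be illustrated as a similarity threshold over multilingual sentence embeddings of each French–Mooré pair. This is only a minimal sketch: the `embed` function, the `filter_pairs` helper, and the 0.7 threshold are illustrative assumptions, not the thesis's actual pipeline.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def filter_pairs(pairs, embed, threshold=0.7):
    """Keep (french, moore) pairs whose embeddings are similar enough.

    `embed` maps a sentence to a vector in a shared multilingual space;
    the threshold value is an assumed hyperparameter.
    """
    return [(src, tgt) for src, tgt in pairs
            if cosine(embed(src), embed(tgt)) >= threshold]
```

In a real pipeline, `embed` would be a multilingual sentence encoder; pairs that survive the threshold would then go to human evaluation, as described above.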