Kristof Van Laerhoven


2025

Bilingual resources for Moroccan Sign Language Generation and Standard Arabic Skills Improvement of Deaf Children
Abdelhadi Soudi | Corinne Vinopol | Kristof Van Laerhoven
Proceedings of the 18th Workshop on Building and Using Comparable Corpora (BUCC)

This paper presents a set of bilingual Standard Arabic (SA)-Moroccan Sign Language (MSL) tools and resources to improve Moroccan Deaf children’s SA skills. An MSL Generator based on rule-based machine translation (MT) is described that enables users, in particular educators of Deaf children, to enter Arabic text and generate its corresponding MSL translation in both graphic and video formats. The generated graphics can be printed and imported into an Arabic reading passage. We have also developed MSL Clip and Create software that includes a bilingual database of 3,000 MSL signs and SA words, a Publisher for the incorporation of MSL graphic support into SA reading passages, and six Templates that create customized bilingual crossword puzzles, word searches, Bingo cards, matching games, flashcards, and fingerspelling scrambles. A crowdsourcing platform for MSL data collection is also described. A major social benefit of developing these resources relates to equity and the status of Deaf people in Moroccan society. More appropriate resources for the bilingual education of Deaf children (in MSL and SA) will lead to improved quality of educational services.

2024

Exploring the Potential of Large Language Models in Adaptive Machine Translation for Generic Text and Subtitles
Abdelhadi Soudi | Mohamed Hannani | Kristof Van Laerhoven | Eleftherios Avramidis
Proceedings of the 17th Workshop on Building and Using Comparable Corpora (BUCC) @ LREC-COLING 2024

Assessing the Performance of ChatGPT-4, Fine-tuned BERT and Traditional ML Models on Moroccan Arabic Sentiment Analysis
Mohamed Hannani | Abdelhadi Soudi | Kristof Van Laerhoven
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities

Large Language Models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks across different languages. However, their performance in low-resource languages and dialects, such as Moroccan Arabic (MA), requires further investigation. This study evaluates the performance of ChatGPT-4, different fine-tuned BERT models, FastText as a text representation, and traditional machine learning models on MA sentiment analysis. Experiments were conducted on two open-source MA datasets, an X (Twitter) Moroccan Arabic corpus (MAC) and a Moroccan Arabic YouTube corpus (MYC), to assess the models' capabilities on sentiment text classification. We compare the performance of fully fine-tuned and pre-trained Arabic BERT-based models with ChatGPT-4 in zero-shot settings.