Haroun Elleuch

2025

TEDxTN: A Three-way Speech Translation Corpus for Code-Switched Tunisian Arabic - English
Fethi Bougares | Salima Mdhaffar | Haroun Elleuch | Yannick Estève
Proceedings of The Third Arabic Natural Language Processing Conference

In this paper, we introduce TEDxTN, the first publicly available Tunisian Arabic to English speech translation dataset. This work is in line with the ongoing effort to mitigate the data scarcity obstacle for a number of Arabic dialects. We collected, segmented, transcribed and translated 108 TEDx talks following our internally developed annotation guidelines. The collected talks represent 25 hours of speech with code-switching and cover speakers with various accents from over 11 different regions of Tunisia. We make the annotation guidelines and corpus publicly available. This will enable the extension of TEDxTN to new talks as they become available. We also report results for strong baseline systems for Speech Recognition and Speech Translation using multiple pre-trained and fine-tuned end-to-end models. This corpus is the first open-source and publicly available speech translation corpus of code-switched Tunisian dialect. We believe that it is a valuable resource that can motivate and facilitate further research on the Tunisian dialect.

LIA and ELYADATA systems for the IWSLT 2025 low-resource speech translation shared task
Chaimae Chellaf | Haroun Elleuch | Othman Istaiteh | D. Fortuné KPONOU | Fethi Bougares | Yannick Estève | Salima Mdhaffar
Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)

In this paper, we present the approach and system setup of our participation in the IWSLT 2025 low-resource speech translation shared task. We submitted systems for three language pairs, namely Tunisian Arabic to English, North Levantine Arabic to English, and Fongbé to French. Both pipeline and end-to-end speech translation systems were explored for the Tunisian Arabic to English and Fongbé to French pairs. However, only pipeline approaches were investigated for the North Levantine Arabic to English translation direction. All our submissions rely on pre-trained models that we further fine-tune on the shared task training data.

ELYADATA & LIA at NADI 2025: ASR and ADI Subtasks
Haroun Elleuch | Youssef Saidi | Salima Mdhaffar | Yannick Estève | Fethi Bougares
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

2024

Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
Salima Mdhaffar | Haroun Elleuch | Fethi Bougares | Yannick Estève
Proceedings of the Second Arabic Natural Language Processing Conference

Speech encoders pretrained through self-supervised learning (SSL) have demonstrated remarkable performance in various downstream tasks, including Spoken Language Understanding (SLU) and Automatic Speech Recognition (ASR). For instance, fine-tuning SSL models for such tasks has shown significant potential, leading to improvements in the SOTA performance across challenging datasets. In contrast to existing research, this paper contributes by comparing the effectiveness of SSL approaches in the context of (i) the low-resource spoken Tunisian Arabic dialect and (ii) its combination with a low-resource SLU and ASR scenario, where only a few semantic annotations are available for fine-tuning. We conducted experiments using many SSL speech encoders on the TARIC-SLU dataset. We used speech encoders that were pre-trained on either monolingual or multilingual speech data. Some of them have also been refined, without in-domain or Tunisian data, through multimodal supervised teacher-student learning. The study yields several significant findings, which we discuss in the paper.

2023

ELYADATA at WojoodNER Shared Task: Data and Model-centric Approaches for Arabic Flat and Nested NER
Imen Laouirine | Haroun Elleuch | Fethi Bougares
Proceedings of ArabicNLP 2023

This paper describes our submissions to the WojoodNER shared task organized during the first ArabicNLP conference. We participated in the two proposed sub-tasks of flat and nested Named Entity Recognition (NER). Our systems were ranked first out of eight and third out of eleven in the nested NER and flat NER sub-tasks, respectively. All our primary submissions are based on DiffusionNER models (Shen et al., 2023), where the NER task is formulated as a boundary-denoising diffusion process. Experiments on nested WojoodNER achieve the best results with a micro F1-score of 93.73%. For the flat sub-task, our primary system was the third-best system, with a micro F1-score of 91.92%.

This paper describes the ON-TRAC consortium speech translation systems developed for the IWSLT 2023 evaluation campaign. Overall, we participated in three speech translation tracks featured in the low-resource and dialect speech translation shared tasks, namely: (i) spoken Tamasheq to written French, (ii) spoken Pashto to written French, and (iii) spoken Tunisian to written English. All our primary submissions are based on the end-to-end speech-to-text neural architecture using a pretrained SAMU-XLSR model as a speech encoder and an mBART model as a decoder. The SAMU-XLSR model is built from the XLS-R 128 model in order to generate language-agnostic sentence-level embeddings. Its training is driven by the LaBSE model, which is trained on a multilingual text dataset. This architecture allows us to improve the input speech representations and achieve significant improvements compared to conventional end-to-end speech translation systems.