André Beyer


2024

ALADAN at IWSLT24 Low-resource Arabic Dialectal Speech Translation Task
Waad Ben Kheder | Josef Jon | André Beyer | Abdel Messaoudi | Rabea Affan | Claude Barras | Maxim Tychonov | Jean-Luc Gauvain
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)

This paper presents ALADAN’s approach to the IWSLT 2024 Dialectal and Low-resource shared task, focusing on Levantine Arabic (apc) and Tunisian Arabic (aeb) to English speech translation (ST). To address challenges such as the lack of standardized orthography and limited training data, we propose a solution for data normalization in Dialectal Arabic that employs a modified Levenshtein distance and Word2vec models to find orthographic variants of the same word. Our system is a cascade ST system integrating two ASR systems (TDNN-F and Zipformer) and two NMT modules derived from pre-trained models (NLLB-200 1.3B distilled model and CohereAI’s Command-R). Additionally, we explore the integration of unsupervised textual and audio data, highlighting the importance of multi-dialectal datasets for both ASR and NMT tasks. Our system achieves a BLEU score of 31.5 for Levantine Arabic on the official validation set.
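The abstract does not spell out how the Levenshtein distance is modified or how it is combined with Word2vec. As a rough illustration only, the sketch below assumes one plausible variant: substitutions between a small, hypothetical set of commonly interchanged Arabic letters cost less than ordinary edits, and two words count as orthographic variants when they are both close in this weighted distance and close in embedding space. The names `CHEAP_PAIRS`, `weighted_levenshtein`, `are_variants`, the cost 0.2, and both thresholds are assumptions for illustration, not the authors' method; the embeddings are passed in as a plain dictionary rather than loaded from a trained Word2vec model.

```python
from math import sqrt

# Hypothetical set of letter pairs treated as cheap substitutions
# (an assumption for illustration, not taken from the paper).
CHEAP_PAIRS = {
    frozenset(("ا", "أ")),  # alef / alef with hamza above
    frozenset(("ا", "إ")),  # alef / alef with hamza below
    frozenset(("ه", "ة")),  # heh / teh marbuta
    frozenset(("ي", "ى")),  # yeh / alef maksura
}


def sub_cost(a, b, cheap=0.2):
    """Substitution cost: 0 if equal, `cheap` for confusable pairs, else 1."""
    if a == b:
        return 0.0
    return cheap if frozenset((a, b)) in CHEAP_PAIRS else 1.0


def weighted_levenshtein(s, t):
    """Levenshtein distance with reduced cost for confusable letters."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,               # deletion
                curr[j - 1] + 1.0,           # insertion
                prev[j - 1] + sub_cost(a, b),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def are_variants(w1, w2, vectors, max_dist=0.5, min_sim=0.7):
    """Assumed decision rule: small weighted edit distance AND similar embeddings."""
    if w1 not in vectors or w2 not in vectors:
        return False
    return (weighted_levenshtein(w1, w2) <= max_dist
            and cosine(vectors[w1], vectors[w2]) >= min_sim)
```

In practice the `vectors` dictionary would be filled from a Word2vec model trained on dialectal text, and the thresholds would need tuning on held-out data; requiring both signals is meant to keep words that merely look alike, or merely occur in similar contexts, from being merged.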