ArTST: Arabic Text and Speech Transformer

Hawau Olamide Toyin; Amirbek Djanibekov; Ajinkya Kulkarni; Hanan Aldarmaki

doi:10.18653/v1/2023.arabicnlp-1.5

Abstract

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language. The model architecture follows the unified-modal framework, SpeechT5, that was recently released for English, and is focused on Modern Standard Arabic (MSA), with plans to extend the model to dialectal and code-switched Arabic in future editions. We pre-trained the model from scratch on MSA speech and text data, and fine-tuned it for the following tasks: Automatic Speech Recognition (ASR), Text-To-Speech synthesis (TTS), and spoken dialect identification. In our experiments comparing ArTST with SpeechT5, as well as with previously reported results on these tasks, ArTST performs on par with or exceeds the current state of the art in all three tasks. Moreover, we find that our pre-training is conducive to generalization, which is particularly evident in the low-resource TTS task. The pre-trained model as well as the fine-tuned ASR and TTS models are released for research use.

Anthology ID: 2023.arabicnlp-1.5
Volume: Proceedings of ArabicNLP 2023
Month: December
Year: 2023
Address: Singapore (Hybrid)
Editors: Hassan Sawaf, Samhaa El-Beltagy, Wajdi Zaghouani, Walid Magdy, Ahmed Abdelali, Nadi Tomeh, Ibrahim Abu Farha, Nizar Habash, Salam Khalifa, Amr Keleg, Hatem Haddad, Imed Zitouni, Khalil Mrini, Rawan Almatham
Venues: ArabicNLP | WS
SIG: SIGARAB
Publisher: Association for Computational Linguistics
Pages: 41–51
URL: https://aclanthology.org/2023.arabicnlp-1.5/
DOI: 10.18653/v1/2023.arabicnlp-1.5
Cite (ACL): Hawau Olamide Toyin, Amirbek Djanibekov, Ajinkya Kulkarni, and Hanan Aldarmaki. 2023. ArTST: Arabic Text and Speech Transformer. In Proceedings of ArabicNLP 2023, pages 41–51, Singapore (Hybrid). Association for Computational Linguistics.
Cite (Informal): ArTST: Arabic Text and Speech Transformer (Toyin et al., ArabicNLP 2023)
PDF: https://aclanthology.org/2023.arabicnlp-1.5.pdf
