NatiQ: An End-to-end Text-to-Speech System for Arabic

Ahmed Abdelali, Nadir Durrani, Cenk Demiroglu, Fahim Dalvi, Hamdy Mubarak, Kareem Darwish


Abstract
NatiQ is end-to-end text-to-speech system for Arabic. Our speech synthesizer uses an encoder-decoder architecture with attention. We used both tacotron-based models (tacotron- 1 and tacotron-2) and the faster transformer model for generating mel-spectrograms from characters. We concatenated Tacotron1 with the WaveRNN vocoder, Tacotron2 with the WaveGlow vocoder and ESPnet transformer with the parallel wavegan vocoder to synthesize waveforms from the spectrograms. We used in-house speech data for two voices: 1) neu- tral male “Hamza”- narrating general content and news, and 2) expressive female “Amina”- narrating children story books to train our models. Our best systems achieve an aver- age Mean Opinion Score (MOS) of 4.21 and 4.40 for Amina and Hamza respectively. The objective evaluation of the systems using word and character error rate (WER and CER) as well as the response time measured by real- time factor favored the end-to-end architecture ESPnet. NatiQ demo is available online at https://tts.qcri.org.
Anthology ID:
2022.wanlp-1.38
Volume:
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Houda Bouamor, Hend Al-Khalifa, Kareem Darwish, Owen Rambow, Fethi Bougares, Ahmed Abdelali, Nadi Tomeh, Salam Khalifa, Wajdi Zaghouani
Venue:
WANLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
394–398
Language:
URL:
https://aclanthology.org/2022.wanlp-1.38
DOI:
10.18653/v1/2022.wanlp-1.38
Bibkey:
Cite (ACL):
Ahmed Abdelali, Nadir Durrani, Cenk Demiroglu, Fahim Dalvi, Hamdy Mubarak, and Kareem Darwish. 2022. NatiQ: An End-to-end Text-to-Speech System for Arabic. In Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP), pages 394–398, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
NatiQ: An End-to-end Text-to-Speech System for Arabic (Abdelali et al., WANLP 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.wanlp-1.38.pdf