DialG2P: Dialectal Grapheme-to-Phoneme. Arabic as a Case Study

Majd Hawasly, Hamdy Mubarak, Ahmed Abdelali, Ahmed Ali


Abstract
Grapheme-to-phoneme (G2P) models are essential components in text-to-speech (TTS) and pronunciation assessment applications. While the standard forms of languages have received attention in this regard, dialectal speech, which is often the primary means of spoken communication for many communities, as is the case for Arabic, has not received the same level of focus. In this paper, we introduce an end-to-end dialectal G2P for Egyptian Arabic, a dialect without standard orthography. Our novel architecture accomplishes three tasks: (i) it restores the short vowels, written as diacritical marks, in the dialectal text; (ii) it maps certain characters that occur only in the spoken version of dialectal Arabic to their dialect-specific character transcriptions; and finally (iii) it converts the output of the previous step to the corresponding phoneme sequence. We benchmark G2P using a modular cascaded system, a large language model, and our multi-task end-to-end architecture.
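
To make the order of the three tasks concrete, here is a minimal sketch of them chained as a simple cascade, where each step consumes the output of the previous one. It is not the architecture from the paper: the function names and the lookup tables (`TOY_VOWELS`, `TOY_DIALECT_CHARS`, `TOY_PHONEMES`) are hypothetical, use Latin transliteration rather than Arabic script, and stand in for trained components.

```python
from typing import Dict, List

# All tables below are toy placeholders in Latin transliteration,
# invented for illustration; they are not data or rules from the paper.
TOY_VOWELS: Dict[str, str] = {"qlb": "qalb", "ktb": "katab"}
TOY_DIALECT_CHARS: Dict[str, str] = {"q": "'"}
TOY_PHONEMES: Dict[str, str] = {"'": "ʔ"}


def restore_vowels(word: str) -> str:
    """Step (i): add short vowels; unknown words pass through unchanged."""
    return TOY_VOWELS.get(word, word)


def map_dialect_chars(word: str) -> str:
    """Step (ii): rewrite characters to a dialect-specific transcription."""
    return "".join(TOY_DIALECT_CHARS.get(ch, ch) for ch in word)


def to_phonemes(word: str) -> List[str]:
    """Step (iii): emit one phoneme symbol per character of the input."""
    return [TOY_PHONEMES.get(ch, ch) for ch in word]


def cascade(word: str) -> List[str]:
    """Run the three steps in order, each consuming the previous output."""
    return to_phonemes(map_dialect_chars(restore_vowels(word)))


if __name__ == "__main__":
    for word in ("qlb", "ktb"):
        print(word, "->", " ".join(cascade(word)))
```

In a cascade like this, an error in an early step propagates to every later step, which is one common motivation for comparing a modular system against an end-to-end model.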
Anthology ID:
2025.arabicnlp-main.38
Volume:
Proceedings of The Third Arabic Natural Language Processing Conference
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue:
ArabicNLP
Publisher:
Association for Computational Linguistics
Pages:
466–471
URL:
https://aclanthology.org/2025.arabicnlp-main.38/
Cite (ACL):
Majd Hawasly, Hamdy Mubarak, Ahmed Abdelali, and Ahmed Ali. 2025. DialG2P: Dialectal Grapheme-to-Phoneme. Arabic as a Case Study. In Proceedings of The Third Arabic Natural Language Processing Conference, pages 466–471, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
DialG2P: Dialectal Grapheme-to-Phoneme. Arabic as a Case Study (Hawasly et al., ArabicNLP 2025)
PDF:
https://aclanthology.org/2025.arabicnlp-main.38.pdf