Leveraging AI to Bridge Classical Arabic and Modern Standard Arabic for Text Simplification

Shatha Altammami


Abstract
This paper introduces the Hadith Simplification Dataset, a novel resource comprising 250 pairs of Classical Arabic (CA) Hadith texts and their simplified Modern Standard Arabic (MSA) equivalents. Addressing the lack of resources for simplifying culturally and religiously significant texts, this dataset bridges linguistic and accessibility gaps while preserving theological integrity. The simplifications were generated using a large language model and rigorously verified by an Islamic Studies expert to ensure precision and cultural sensitivity. By tackling the unique lexical, syntactic, and cultural challenges of CA-to-MSA transformation, this resource advances Arabic text simplification research. Beyond religious texts, the methodology developed is adaptable to other domains, such as poetry and historical literature. This work underscores the importance of ethical AI applications in preserving the integrity of religious texts while enhancing their accessibility to modern audiences.
Anthology ID:
2025.clrel-1.8
Volume:
Proceedings of the New Horizons in Computational Linguistics for Religious Texts
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Sane Yagi, Majdi Sawalha, Bayan Abu Shawar, Abdallah T. AlShdaifat, Norhan Abbas, Organizers
Venues:
CLRel | WS
Publisher:
Association for Computational Linguistics
Pages:
76–85
URL:
https://aclanthology.org/2025.clrel-1.8/
Cite (ACL):
Shatha Altammami. 2025. Leveraging AI to Bridge Classical Arabic and Modern Standard Arabic for Text Simplification. In Proceedings of the New Horizons in Computational Linguistics for Religious Texts, pages 76–85, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Leveraging AI to Bridge Classical Arabic and Modern Standard Arabic for Text Simplification (Altammami, CLRel 2025)
PDF:
https://aclanthology.org/2025.clrel-1.8.pdf