Automated Authentication of Quranic Verses Using BERT (Bidirectional Encoder Representations from Transformers) based Language Models

Khubaib Amjad Alam, Maryam Khalid, Syed Ahmed Ali, Haroon Mahmood, Qaisar Shafi, Muhammad Haroon, Zulqarnain Haider


Abstract
The proliferation of Quranic content on digital platforms, including websites and social media, has made it significantly harder to verify the authenticity of Quranic verses. The inherent complexity of the Arabic language, with its rich morphology, syntax, and semantics, makes traditional text-processing techniques inadequate for robust authentication. This paper addresses the problem with state-of-the-art transformer-based language models tailored for Arabic text processing. Our approach fine-tunes three transformer architectures, BERT-Base-Arabic, AraBERT, and MarBERT, on a curated dataset containing both authentic and non-authentic verses. Non-authentic examples were created using sentence-BERT, which applies cosine similarity to introduce subtle modifications. Comprehensive experiments were conducted to evaluate the models. Among the three candidates, MarBERT, which is specifically designed for handling Arabic dialects, demonstrated superior performance, achieving an F1-score of 93.80%. BERT-Base-Arabic also achieved a competitive F1-score of 92.90%, reflecting its robust understanding of Arabic text. The findings underscore the potential of transformer-based models for handling the linguistic complexities inherent in Quranic text and pave the way for automated, reliable tools for Quranic verse authentication in the digital era.
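The abstract relies on two standard quantities: cosine similarity, used when generating non-authentic examples, and the F1-score, used to compare the models. The following is a minimal sketch of both in plain Python, not the authors' pipeline. The function names, the toy vectors, and the toy labels are all illustrative assumptions. In the paper the vectors would be sentence-BERT embeddings, and the labels would come from the fine-tuned classifiers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed, not from the paper): 1 = authentic, 0 = non-authentic.
similarity = cosine_similarity([1.0, 0.0], [1.0, 0.0])
score = f1_score([1, 1, 0, 0], [1, 0, 0, 1])
```

A similarity near 1 indicates that two embeddings point in almost the same direction, which is what makes a modified verse a subtle, hard-to-detect negative example.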
Anthology ID:
2025.clrel-1.6
Volume:
Proceedings of the New Horizons in Computational Linguistics for Religious Texts
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Sane Yagi, Majdi Sawalha, Bayan Abu Shawar, Abdallah T. AlShdaifat, Norhan Abbas
Venues:
CLRel | WS
Publisher:
Association for Computational Linguistics
Pages:
59–66
URL:
https://aclanthology.org/2025.clrel-1.6/
Cite (ACL):
Khubaib Amjad Alam, Maryam Khalid, Syed Ahmed Ali, Haroon Mahmood, Qaisar Shafi, Muhammad Haroon, and Zulqarnain Haider. 2025. Automated Authentication of Quranic Verses Using BERT (Bidirectional Encoder Representations from Transformers) based Language Models. In Proceedings of the New Horizons in Computational Linguistics for Religious Texts, pages 59–66, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Automated Authentication of Quranic Verses Using BERT (Bidirectional Encoder Representations from Transformers) based Language Models (Alam et al., CLRel 2025)
PDF:
https://aclanthology.org/2025.clrel-1.6.pdf