Quranic Verses Semantic Relatedness Using AraBERT

Abdullah Alsaleh, Eric Atwell, Abdulrahman Altahhan


Abstract
Bidirectional Encoder Representations from Transformers (BERT) has gained popularity in recent years, producing state-of-the-art performance across Natural Language Processing tasks. In this paper, we use the AraBERT language model to classify pairs of verses from the QurSim dataset as semantically related or not. We pre-processed the QurSim dataset and formed three datasets for comparison. We also used both versions of AraBERT, AraBERTv02 and AraBERTv2, to determine which performs best on these datasets. The best result was obtained by AraBERTv02, with a 92% accuracy score on a dataset comprising pairs with label '2' and label '-1', the latter generated outside the QurSim dataset.
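The dataset described above pairs related verses (label '2') with unrelated pairs (label '-1'). The abstract does not say how the '-1' pairs were generated, so the following is only a minimal sketch under one assumption: that unrelated pairs are drawn at random from verse pairs that are not listed as related. The function name `build_binary_dataset` and the placeholder verse identifiers (`v1`, `v2`, ...) are hypothetical and are not taken from the paper or from QurSim.

```python
import random


def build_binary_dataset(related_pairs, verse_ids, seed=0):
    """Combine related pairs (label 2) with sampled unrelated pairs (label -1).

    Assumption: negatives are random verse pairs absent from the related set.
    This is an illustrative choice, not the procedure used in the paper.
    """
    rng = random.Random(seed)
    # Treat pairs as unordered so (a, b) and (b, a) count as the same pair.
    known = {frozenset(pair) for pair in related_pairs}
    positives = [(a, b, 2) for a, b in related_pairs]

    # Guard against an endless loop when too few unrelated pairs exist.
    n = len(verse_ids)
    available = n * (n - 1) // 2 - len(known)
    if available < len(positives):
        raise ValueError("not enough unrelated pairs to balance the dataset")

    negatives, seen = [], set()
    while len(negatives) < len(positives):
        a, b = rng.sample(verse_ids, 2)  # two distinct verses
        key = frozenset((a, b))
        if key in known or key in seen:
            continue  # skip related pairs and duplicates
        seen.add(key)
        negatives.append((a, b, -1))

    data = positives + negatives
    rng.shuffle(data)
    return data


# Toy usage with placeholder identifiers (not real QurSim entries).
related = [("v1", "v2"), ("v3", "v4")]
verses = ["v1", "v2", "v3", "v4", "v5", "v6"]
dataset = build_binary_dataset(related, verses, seed=1)
```

Each resulting triple holds two verse identifiers and a label. In a setup like the one the abstract describes, the corresponding verse texts would then be given as a sentence pair to a fine-tuned classifier such as AraBERT.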
Anthology ID:
2021.wanlp-1.19
Volume:
Proceedings of the Sixth Arabic Natural Language Processing Workshop
Month:
April
Year:
2021
Address:
Kyiv, Ukraine (Virtual)
Editors:
Nizar Habash, Houda Bouamor, Hazem Hajj, Walid Magdy, Wajdi Zaghouani, Fethi Bougares, Nadi Tomeh, Ibrahim Abu Farha, Samia Touileb
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
185–190
URL:
https://aclanthology.org/2021.wanlp-1.19
Cite (ACL):
Abdullah Alsaleh, Eric Atwell, and Abdulrahman Altahhan. 2021. Quranic Verses Semantic Relatedness Using AraBERT. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, pages 185–190, Kyiv, Ukraine (Virtual). Association for Computational Linguistics.
Cite (Informal):
Quranic Verses Semantic Relatedness Using AraBERT (Alsaleh et al., WANLP 2021)
PDF:
https://aclanthology.org/2021.wanlp-1.19.pdf