Evaluating Calibration of Arabic Pre-trained Language Models on Dialectal Text

Ali Al-Laith, Rachida Kebdani


Abstract
While pre-trained language models have made significant progress in different classification tasks, little attention has been given to the reliability of their confidence scores. Calibration, how well model confidence aligns with actual accuracy, is essential for real-world applications where decisions rely on probabilistic outputs. This study addresses this gap in Arabic dialect identification by assessing the calibration of eight pre-trained language models, ensuring their predictions are not only accurate but also reliable for practical applications. We analyze two datasets: one with over 1 million text samples and the Nuanced Arabic Dialect Identification dataset (NADI-2023). Using Expected Calibration Error (ECE) as a metric, we reveal substantial variation in model calibration across dialects in both datasets, showing that prediction confidence can vary significantly depending on regional data. This research has implications for improving the reliability of Arabic dialect models in applications like sentiment analysis and social media monitoring.
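As an illustration of the metric named in the abstract, the following is a minimal sketch of Expected Calibration Error in its common binned form: predictions are grouped by confidence, and ECE is the sample-weighted average of the gap between accuracy and mean confidence in each bin. The abstract does not state the binning scheme or the number of bins, so the equal-width bins, the default of 10 bins, and the names `expected_calibration_error` and `ece_by_group` are assumptions made for this example, not the authors' implementation.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE with equal-width bins over [0, 1] (assumed formulation).

    confidences: top-class probabilities, one per prediction.
    correct: booleans, True where the predicted label matched the gold label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin accumulators: sample count, confidence sum, correct count.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        # Bins are half-open [lo, hi); a confidence of exactly 1.0 goes
        # into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hits[idx] += 1 if ok else 0

    ece = 0.0
    for count, conf_sum, hit in zip(counts, conf_sums, hits):
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = hit / count
        ece += (count / n) * abs(accuracy - avg_conf)
    return ece


def ece_by_group(confidences, correct, groups, n_bins=10):
    """ECE computed separately per group label (e.g. per dialect)."""
    buckets = {}
    for conf, ok, group in zip(confidences, correct, groups):
        confs, oks = buckets.setdefault(group, ([], []))
        confs.append(conf)
        oks.append(ok)
    return {
        group: expected_calibration_error(confs, oks, n_bins)
        for group, (confs, oks) in buckets.items()
    }
```

Calling `ece_by_group` with one group label per sample (for instance a hypothetical dialect tag such as `"EGY"` or `"MOR"`) gives one ECE value per dialect, which is the kind of per-dialect comparison the abstract describes. A lower value means confidence tracks accuracy more closely.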
Anthology ID: 2025.wacl-1.8
Volume: Proceedings of the 4th Workshop on Arabic Corpus Linguistics (WACL-4)
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Saad Ezzini, Hamza Alami, Ismail Berrada, Abdessamad Benlahbib, Abdelkader El Mahdaouy, Salima Lamsiyah, Hatim Derrouz, Amal Haddad Haddad, Mustafa Jarrar, Mo El-Haj, Ruslan Mitkov, Paul Rayson
Venues: WACL | WS
Publisher: Association for Computational Linguistics
Pages: 68–76
URL: https://aclanthology.org/2025.wacl-1.8/
Cite (ACL): Ali Al-Laith and Rachida Kebdani. 2025. Evaluating Calibration of Arabic Pre-trained Language Models on Dialectal Text. In Proceedings of the 4th Workshop on Arabic Corpus Linguistics (WACL-4), pages 68–76, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Evaluating Calibration of Arabic Pre-trained Language Models on Dialectal Text (Al-Laith & Kebdani, WACL 2025)
PDF: https://aclanthology.org/2025.wacl-1.8.pdf