ImageEval 2025: The First Arabic Image Captioning Shared Task

Ahlam Bashiti; Alaa Aljabari; Hadi Khaled Hamoud; Md. Rafiul Biswas; Bilal Mohammed Shalash; Mustafa Jarrar; Fadi A. Zaraket; George Mikros; Ehsaneddin Asgari; Wajdi Zaghouani

doi:10.18653/v1/2025.arabicnlp-sharedtasks.52

Abstract

We present ImageEval 2025, the first shared task dedicated to Arabic image captioning. The task addresses a critical gap in multimodal Arabic NLP through two complementary subtasks: (1) creating the first open-source, manually captioned Arabic image dataset through a collaborative datathon, and (2) developing and evaluating Arabic image captioning models. A total of 44 teams registered, of which eight submitted during the test phase, producing 111 valid submissions. Evaluation was conducted using automatic metrics, LLM-based judgment, and human assessment. In Subtask 1, the best-performing system achieved a cosine similarity of 65.5, while in Subtask 2, the top score was 60.0. Although these results show encouraging progress, they also confirm that Arabic image captioning remains a challenging task, particularly due to cultural grounding requirements, morphological richness, and dialectal variation. All datasets, baseline models, and evaluation tools are released publicly to support future research in Arabic multimodal NLP.

Anthology ID: 2025.arabicnlp-sharedtasks.52
Volume: Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
Month: November
Year: 2025
Address: Suzhou, China
Editors: Kareem Darwish, Ahmed Ali, Ibrahim Abu Farha, Samia Touileb, Imed Zitouni, Ahmed Abdelali, Sharefah Al-Ghamdi, Sakhar Alkhereyf, Wajdi Zaghouani, Salam Khalifa, Badr AlKhamissi, Rawan Almatham, Injy Hamed, Zaid Alyafeai, Areeb Alowisheq, Go Inoue, Khalil Mrini, Waad Alshammari
Venue: ArabicNLP
SIG: SIGARAB
Publisher: Association for Computational Linguistics
Pages: 376–389
URL: https://aclanthology.org/2025.arabicnlp-sharedtasks.52/
DOI: 10.18653/v1/2025.arabicnlp-sharedtasks.52
Cite (ACL): Ahlam Bashiti, Alaa Aljabari, Hadi Khaled Hamoud, Md. Rafiul Biswas, Bilal Mohammed Shalash, Mustafa Jarrar, Fadi Zaraket, George Mikros, Ehsaneddin Asgari, and Wajdi Zaghouani. 2025. ImageEval 2025: The First Arabic Image Captioning Shared Task. In Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks, pages 376–389, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): ImageEval 2025: The First Arabic Image Captioning Shared Task (Bashiti et al., ArabicNLP 2025)
PDF: https://aclanthology.org/2025.arabicnlp-sharedtasks.52.pdf
