Ashraf Elnagar


2024

AraCLIP: Cross-Lingual Learning for Effective Arabic Image Retrieval
Muhammad Al-Barham | Imad Afyouni | Khalid Almubarak | Ashraf Elnagar | Ayad Turky | Ibrahim Hashem
Proceedings of The Second Arabic Natural Language Processing Conference

This paper introduces Arabic Contrastive Language-Image Pre-training (AraCLIP), a model designed for Arabic image retrieval tasks, building upon the Contrastive Language-Image Pre-training (CLIP) architecture. AraCLIP leverages Knowledge Distillation to transfer cross-modal knowledge from English to Arabic, enhancing its ability to understand Arabic text and retrieve relevant images. Unlike existing multilingual models, AraCLIP is uniquely positioned to understand the intricacies of the Arabic language, including specific terms, cultural nuances, and contextual constructs. By leveraging the CLIP architecture as our foundation, we introduce a novel approach that seamlessly integrates textual and visual modalities, enabling AraCLIP to effectively retrieve images based on Arabic textual queries. We offer an online demonstration allowing users to input Arabic prompts and compare AraCLIP’s performance with state-of-the-art multilingual models. We conduct comprehensive experiments to evaluate AraCLIP’s performance across diverse datasets, including Arabic XTD-11 and Arabic Flickr8k. Our results showcase AraCLIP’s superiority in image retrieval accuracy, demonstrating its effectiveness in handling Arabic queries. AraCLIP represents a significant advancement in cross-lingual image retrieval, offering promising applications in Arabic language processing and beyond.

2022

Arabic Image Captioning using Pre-training of Deep Bidirectional Transformers
Jonathan Emami | Pierre Nugues | Ashraf Elnagar | Imad Afyouni
Proceedings of the 15th International Conference on Natural Language Generation

2019

Automatic Text Tagging of Arabic News Articles Using Ensemble Deep Learning Models
Ashraf Elnagar | Omar Einea | Ridhwan Al-Debsi
Proceedings of the 3rd International Conference on Natural Language and Speech Processing