mucAI at WojoodNER 2024: Arabic Named Entity Recognition with Nearest Neighbor Search

Ahmed Abdou, Tasneem Mahmoud


Abstract
Named Entity Recognition (NER) is a task in Natural Language Processing (NLP) that aims to identify and classify entities in text into predefined categories. However, when applied to Arabic data, NER encounters unique challenges stemming from the language’s rich morphological inflections, where a single word can comprise multiple morphemes, the absence of capitalization cues, and spelling variants. In this paper, we introduce Arabic KNN-NER, our submission to the Wojood NER Shared Task 2024 (ArabicNLP 2024). We participated in Subtask 1, Flat NER. In this subtask, we tackle fine-grained flat-entity recognition for Arabic text, where we identify a single main entity and zero or more sub-entities for each word. Arabic KNN-NER augments the probability distribution of a fine-tuned model with another label probability distribution derived from performing a KNN search over the cached training data. Our submission achieved 91% on the test set of the WojoodFine dataset, placing Arabic KNN-NER at the top of the leaderboard for the shared task.
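The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the squared Euclidean distance, the temperature, the interpolation weight `lam`, the value of `k`, the toy two-dimensional vectors, and the label names are all assumptions chosen for illustration. The sketch turns the nearest cached training tokens into a label distribution and mixes it with the fine-tuned model's distribution.

```python
import math
from collections import defaultdict


def knn_label_distribution(query, datastore, labels, k=2, temperature=1.0):
    """Build a label distribution from the k cached tokens nearest to `query`.

    `datastore` is a list of (vector, label) pairs standing in for the cached
    training data; the distance and temperature are illustrative assumptions.
    """
    scored = []
    for vec, label in datastore:
        # Squared Euclidean distance between the query and a cached vector.
        dist = sum((q - v) ** 2 for q, v in zip(query, vec))
        scored.append((dist, label))
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]

    # Closer neighbours contribute more weight to their label.
    weights = defaultdict(float)
    for dist, label in nearest:
        weights[label] += math.exp(-dist / temperature)
    total = sum(weights.values())
    return {label: weights[label] / total for label in labels}


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the model distribution with the KNN distribution (lam is assumed)."""
    return {
        label: lam * knn_probs.get(label, 0.0) + (1.0 - lam) * p
        for label, p in model_probs.items()
    }


# Toy data: hypothetical token vectors and labels, not taken from WojoodFine.
labels = ["O", "B-PERS", "B-ORG"]
datastore = [
    ([0.0, 0.0], "O"),
    ([1.0, 1.0], "B-PERS"),
    ([0.9, 1.1], "B-PERS"),
    ([5.0, 5.0], "B-ORG"),
]
query = [1.0, 1.0]
model_probs = {"O": 0.5, "B-PERS": 0.4, "B-ORG": 0.1}

knn_probs = knn_label_distribution(query, datastore, labels, k=2)
final = interpolate(model_probs, knn_probs, lam=0.5)
predicted = max(final, key=final.get)
```

In this toy case the model alone prefers `O`, while both nearest cached tokens carry `B-PERS`, so the mixed distribution selects `B-PERS`. A real system would replace the linear scan with an approximate nearest-neighbor index over encoder representations.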
Anthology ID:
2024.arabicnlp-1.107
Volume:
Proceedings of The Second Arabic Natural Language Processing Conference
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Nizar Habash, Houda Bouamor, Ramy Eskander, Nadi Tomeh, Ibrahim Abu Farha, Ahmed Abdelali, Samia Touileb, Injy Hamed, Yaser Onaizan, Bashar Alhafni, Wissam Antoun, Salam Khalifa, Hatem Haddad, Imed Zitouni, Badr AlKhamissi, Rawan Almatham, Khalil Mrini
Venues:
ArabicNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
894–898
URL:
https://aclanthology.org/2024.arabicnlp-1.107
DOI:
10.18653/v1/2024.arabicnlp-1.107
Cite (ACL):
Ahmed Abdou and Tasneem Mahmoud. 2024. mucAI at WojoodNER 2024: Arabic Named Entity Recognition with Nearest Neighbor Search. In Proceedings of The Second Arabic Natural Language Processing Conference, pages 894–898, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
mucAI at WojoodNER 2024: Arabic Named Entity Recognition with Nearest Neighbor Search (Abdou & Mahmoud, ArabicNLP-WS 2024)
PDF:
https://aclanthology.org/2024.arabicnlp-1.107.pdf