Noha Adly


2023

pdf bib
AlexU-AIC at WojoodNER shared task: Sequence Labeling vs MRC and SWA for Arabic Named Entity Recognition
Shereen Elkordi | Noha Adly | Marwan Torki
Proceedings of ArabicNLP 2023

Named entity recognition (NER) is one of many challenging tasks in Arabic Natural Language Processing. It is also the base of many critical downstream tasks to help understand the source of major trends and public opinion. In this paper, we will describe our submission in the NER Shared Task of ArabicNLP 2023. We used a simple machine reading comprehension-based technique in the Flat NER Subtask ranking eighth on the leaderboard, while we fine-tuned a language model for the Nested NER Subtask ranking third on the leaderboard.

2022

pdf bib
The AIC System for the WMT 2022 Unsupervised MT and Very Low Resource Supervised MT Task
Ahmad Shapiro | Mahmoud Salama | Omar Abdelhakim | Mohamed Fayed | Ayman Khalafallah | Noha Adly
Proceedings of the Seventh Conference on Machine Translation (WMT)

This paper presents our submissions to WMT 22 shared task in the Unsupervised and Very Low Resource Supervised Machine Translation tasks. The task revolves around translating between German ↔ Upper Sorbian (de ↔ hsb), German ↔ Lower Sorbian (de ↔ dsb) and Upper Sorbian ↔ Lower Sorbian (hsb ↔ dsb) in both unsupervised and supervised manner. For the unsupervised system, we trained an unsupervised phrase-based statistical machine translation (UPBSMT) system on each pair independently. We pretrained a De-Salvic mBART model on the following languages Polish (pl), Czech (cs), German (de), Upper Sorbian (hsb), Lower Sorbian (dsb). We then fine-tuned our mBART on the synthetic parallel data generated by the (UPBSMT) model along with authentic parallel data (de ↔ pl, de ↔ cs). We further fine-tuned our unsupervised system on authentic parallel data (hsb ↔ dsb, de ↔ dsb, de ↔ hsb) to submit our supervised low-resource system.

2006

pdf bib
Building a Heterogeneous Information Retrieval Collection of Printed Arabic Documents
Abdelrahim Abdelsapor | Noha Adly | Kareem Darwish | Ossama Emam | Walid Magdy | Magdi Nagi
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the development of an Arabic document image collection containing 34,651 documents from 1,378 different books and 25 topics with their relevance judgments. The books from which the collection is obtained are a part of a larger collection 75,000 books being scanned for archival and retrieval at the bibliotheca Alexandrina (BA). The documents in the collection vary widely in topics, fonts, and degradation levels. Initial baseline experiments were performed to examine the effectiveness of different index terms, with and without blind relevance feedback, on Arabic OCR degraded text.