Youssef Zaghloul

2026

QalamID at AbjadAuthorID Shared Task: Morphology Matters, A Hybrid Ensemble for Arabic Authorship Attribution
Youssef Zaghloul
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script

Arabic authorship attribution presents unique challenges due to the language’s rich derivational morphology, which often fragments word-level frequencies. In this paper, we describe our winning submission to the AbjadAuthorID Shared Task. We propose a hybrid ensemble system that fuses the morphological precision of character n-gram LinearSVCs with the semantic understanding of fine-tuned Transformers (AraBERT and XLM-RoBERTa). Contrary to current trends in NLP, we demonstrate that traditional character n-grams (0.92 F1) significantly outperform deep learning baselines (AraBERT, 0.87 F1) for this task, suggesting that authorial signature in Arabic is encoded more densely in morphological patterns than in semantic content. Our final system employs a novel post-hoc calibration technique, Precision Scalpel, together with selective pseudo-labeling to address class imbalance and genre confounds. The system ranked first, with a macro F1-score of 0.932 and an accuracy of 0.963 on the test set.
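
The claim that derivational morphology fragments word-level frequencies can be illustrated with a small sketch. It is not the feature pipeline from the paper: the helper names (`word_counts`, `char_ngrams`, `cosine`), the 2-to-4 n-gram range, the per-token space padding, and the two example phrases are all assumptions chosen for illustration. The phrases use different surface forms derived from the same root (k-t-b), so they share no whole word but do share character n-grams.

```python
from collections import Counter
from math import sqrt


def word_counts(text):
    """Bag-of-words counts over whitespace-separated tokens."""
    return Counter(text.split())


def char_ngrams(text, n_min=2, n_max=4):
    """Count character n-grams inside each token, padded with one space per side.

    The n-gram range and the padding are illustrative assumptions,
    not settings reported in the paper.
    """
    counts = Counter()
    for token in text.split():
        padded = f" {token} "
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical snippets built from different derivations of the root k-t-b.
doc_a = "كتب الكاتب"
doc_b = "كتاب مكتوب"

# No token is shared, so the word-level similarity is zero.
print("word-level cosine:", cosine(word_counts(doc_a), word_counts(doc_b)))

# Shared substrings such as "كت" give the character-level vectors some overlap.
print("char n-gram cosine:", cosine(char_ngrams(doc_a), char_ngrams(doc_b)))
```

In a full system, counts like these would typically be weighted (for example with TF-IDF) and passed to a linear classifier such as the LinearSVC named in the abstract; that step is omitted here to keep the sketch dependency-free.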

