R-R at AbjadAuthorID Shared Task: A Fine-Tuned Approach for Kurdish Authorship Identification

Rania Azad M. San Ahmed, Rebwar M. Nabi


Abstract
Authorship identification is a fundamental task in natural language processing and computational stylistics. Despite significant advances in high-resource languages, low-resource languages, particularly those written in non-Latin scripts, remain largely underexplored, leaving a critical gap in resources and benchmarks for Kurdish, a linguistically distinct, low-resource language. Addressing this gap, this paper presents Task 3 of AbjadNLP 2026, the first shared task dedicated to authorship identification for Kurdish. The task introduces a newly constructed dataset designed to capture the unique phonological and orthographic features of Sorani Kurdish and formulates the problem as closed-set multiclass classification. To establish a robust baseline, we fine-tune the pretrained XLM-RoBERTa model to capture authorial stylistic patterns. Experimental results on the test set demonstrate the efficacy of transformer-based representations for this domain, achieving an accuracy of approximately 75%.
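The closed-set formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Hugging Face `transformers` library, the `xlm-roberta-base` checkpoint, and the helper names `build_label_map`, `accuracy`, and `load_classifier` are all assumptions made for illustration, since the abstract names neither the toolkit nor the model size.

```python
def build_label_map(authors):
    """Map each distinct author name to an integer class id.

    In a closed-set setting every test author also appears in training,
    so the label set is fixed once from the training authors.
    """
    return {name: idx for idx, name in enumerate(sorted(set(authors)))}


def accuracy(predicted, gold):
    """Fraction of documents whose predicted author id matches the gold id."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def load_classifier(num_authors, checkpoint="xlm-roberta-base"):
    """Load a tokenizer and an XLM-RoBERTa model with a classification head.

    Assumption: the checkpoint name is illustrative only. The import is
    kept inside the function so the helpers above work without the
    transformers package installed.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_authors
    )
    return tokenizer, model


# Toy usage with placeholder author names (not taken from the task dataset).
label_map = build_label_map(["author_b", "author_a", "author_b"])
toy_accuracy = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

The classification head produced by `load_classifier` has one output per author in `label_map`, and fine-tuning would then minimize cross-entropy over those classes; the training loop itself is omitted here because the abstract gives no hyperparameters.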
Anthology ID:
2026.abjadnlp-1.67
Volume:
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
AbjadNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
525–529
URL:
https://aclanthology.org/2026.abjadnlp-1.67/
Cite (ACL):
Rania Azad M. San Ahmed and Rebwar M. Nabi. 2026. R-R at AbjadAuthorID Shared Task: A Fine-Tuned Approach for Kurdish Authorship Identification. In Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script, pages 525–529, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
R-R at AbjadAuthorID Shared Task: A Fine-Tuned Approach for Kurdish Authorship Identification (Azad M. San Ahmed & Nabi, AbjadNLP 2026)
PDF:
https://aclanthology.org/2026.abjadnlp-1.67.pdf