LoRAD: Low-Resource AI-Generated Text Detection with XLM-RoBERTa

Ali Zain


Abstract
This paper describes our system submitted to the AbjadGenEval Shared Task at ArabicNLP 2026, which focuses on binary classification of human-written versus machine-generated text in low-resource languages. We participated in two independent subtasks targeting Arabic and Urdu news and literary texts. Our approach relies exclusively on fine-tuning XLM-RoBERTa, a multilingual Transformer-based model, under carefully controlled training and preprocessing settings. While the same model architecture was used for both subtasks, language-specific data handling strategies were applied based on empirical observations. The proposed system achieved first place in the Urdu subtask and third place in the Arabic subtask according to the official evaluation. These results demonstrate that multilingual pretrained models can serve as strong and reliable systems for AI-generated text detection across diverse languages.
Anthology ID:
2026.abjadnlp-1.57
Volume:
Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
AbjadNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
468–471
URL:
https://aclanthology.org/2026.abjadnlp-1.57/
Cite (ACL):
Ali Zain. 2026. LoRAD: Low-Resource AI-Generated Text Detection with XLM-RoBERTa. In Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script, pages 468–471, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
LoRAD: Low-Resource AI-Generated Text Detection with XLM-RoBERTa (Zain, AbjadNLP 2026)
PDF:
https://aclanthology.org/2026.abjadnlp-1.57.pdf