Olga Snissarenko at AbjadMed: Arabic Clinical Text Classification with AraBERT: Results from the AbjadMed Shared Task

Olga Snissarenko

Olga Snissarenko at AbjadMed: Arabic Clinical Text Classification with AraBERT: Results from the AbjadMed Shared Task

Abstract

We present a solution for the Arabic medical text classification task, formulated as a multi-class classification problem with 82 medical categories. The task is challenging due to severe class imbalance, long and heterogeneous input texts, and the presence of domain-specific medical terminology in Modern Standard Arabic. Our approach is based on fine-tuning pretrained AraBERT models with a focus on loss-level imbalance handling rather than architectural complexity. Through a systematic comparison of multiple AraBERT-based configurations, we show that class-weighted loss combined with simple mean pooling yields the strongest performance. Our best model achieves a macro-F1 score of 0.387 on the public evaluation set and 0.411 on the private test set.

Anthology ID:: 2026.abjadnlp-1.25
Volume:: Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script
Month:: March
Year:: 2026
Address:: Rabat, Morocco
Venues:: AbjadNLP | WS
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 177–181
Language:
URL:: https://aclanthology.org/2026.abjadnlp-1.25/
DOI:
Bibkey:
Cite (ACL):: Olga Snissarenko. 2026. Olga Snissarenko at AbjadMed: Arabic Clinical Text Classification with AraBERT: Results from the AbjadMed Shared Task. In Proceedings of the 2nd Workshop on NLP for Languages Using Arabic Script, pages 177–181, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):: Olga Snissarenko at AbjadMed: Arabic Clinical Text Classification with AraBERT: Results from the AbjadMed Shared Task (Snissarenko, AbjadNLP 2026)
Copy Citation:
PDF:: https://aclanthology.org/2026.abjadnlp-1.25.pdf

PDF Cite Search Fix data