Sujith Kanakkassery


2026

This paper describes our system submitted to the AbjadMed 2026 shared task at AbjadNLP. The task focuses on multi-class classification of Arabic medical texts under severe class imbalance. Our approach fine-tunes a pre-trained Arabic Transformer model and incorporates several imbalance-aware strategies, including data cleaning, class-weighted loss, and label smoothing. In ablation experiments, we observe consistent improvements over a baseline system, which indicates that these techniques help on underrepresented medical categories. Finally, our error analysis highlights persistent challenges related to label sparsity and semantic overlap among medical classes.
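As a rough illustration of how a class-weighted loss and label smoothing can be combined, the sketch below computes inverse-frequency class weights and a smoothed, weighted cross-entropy for a single example in plain Python. The abstract does not state the weighting scheme or the smoothing value, so the inverse-frequency formula, the default `smoothing=0.1`, and all function names here are assumptions made for illustration, not the system's actual implementation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight_c = N / (K * count_c), so rare classes weigh more."""
    counts = Counter(labels)
    n = len(labels)
    # Classes absent from the data get weight 0.0 to avoid division by zero.
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def weighted_smoothed_loss(logits, target, weights, smoothing=0.1):
    """Cross-entropy for one example with class weights and label smoothing.

    The smoothed target puts (1 - smoothing) + smoothing / K on the true
    class and smoothing / K on every other class; each term is scaled by
    the weight of the class it refers to.
    """
    k = len(logits)
    log_probs = log_softmax(logits)
    loss = 0.0
    for c in range(k):
        q = smoothing / k + ((1.0 - smoothing) if c == target else 0.0)
        loss -= weights[c] * q * log_probs[c]
    return loss


# Hypothetical toy data: class 0 is three times as frequent as class 1.
toy_labels = [0, 0, 0, 1]
toy_weights = inverse_frequency_weights(toy_labels, num_classes=2)
toy_loss = weighted_smoothed_loss([0.2, -0.1], target=1, weights=toy_weights)
```

With these weights, an error on the rare class contributes more to the loss than the same error on the frequent class, while smoothing keeps the target distribution from being fully one-hot.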