Gleb Shanshin


2026

We present a supervised system for Arabic medical question-answer classification developed for the AbjadMed shared task. The task requires assigning each example to one of 82 medical categories with a highly imbalanced label distribution and is evaluated using macro-averaged F1. Our approach builds on an AraBERT model that was further pretrained on a related Arabic medical classification dataset. Under a unified fine-tuning setup, this domain-adapted model consistently outperforms general-purpose Arabic backbones, with the best results obtained at a low backbone learning rate, indicating that only limited adaptation is required. The final system achieves a macro F1 score of 0.51 on the private test split. For comparison, we evaluate several cost-efficient large language models under constrained prompting and observe substantially lower performance.
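The core of the setup is a standard sequence-classification head on top of the domain-adapted AraBERT encoder, with the backbone trained at a lower learning rate than the freshly initialized head. The sketch below illustrates one way to express this with HuggingFace Transformers; the checkpoint name and both learning-rate values are illustrative assumptions, not the exact configuration used in the paper.

```python
# Minimal sketch of a differential-learning-rate fine-tuning setup.
# The checkpoint and the two learning rates are assumptions for
# illustration, not the paper's reported hyperparameters.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_LABELS = 82  # number of medical categories in the shared task

tokenizer = AutoTokenizer.from_pretrained("aubmindlab/bert-base-arabertv2")
model = AutoModelForSequenceClassification.from_pretrained(
    "aubmindlab/bert-base-arabertv2", num_labels=NUM_LABELS
)

# Two parameter groups: a low learning rate for the pretrained backbone
# (only limited adaptation is needed) and a higher one for the new head.
backbone = [p for n, p in model.named_parameters() if not n.startswith("classifier")]
head = [p for n, p in model.named_parameters() if n.startswith("classifier")]
optimizer = torch.optim.AdamW(
    [
        {"params": backbone, "lr": 1e-5},  # low backbone LR (assumed value)
        {"params": head, "lr": 1e-4},      # higher head LR (assumed value)
    ]
)
```

Note that macro-averaged F1 gives each of the 82 classes equal weight, so performance on rare categories contributes to the score as much as performance on frequent ones, which is why the imbalanced label distribution makes the task difficult.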