Investigating Large Language Models’ (LLMs) Capabilities for Sexism Detection on a Low-Resource Language

Lutfiye Seda Mut Altin, Horacio Saggion


Abstract
Automatic detection of sexist language on social media is gaining attention because of its harmful societal impact and the technical challenges it presents. The limited availability of data resources in some languages restricts the development of effective tools to fight the spread of such content. In this work, we investigated various methods to improve the efficiency of automatic detection of sexism and its subtypes in a low-resource language, Turkish. We first experimented with various LLM prompting strategies for classification and then investigated the impact of different data augmentation strategies, including both synthetic data generation with LLMs (GPT, DeepSeek) and translation-based augmentation using English and Spanish data. Finally, we examined whether these augmentation methods would improve the performance of a trained neural network (BERT). Our benchmarking results show that a fine-tuned LLM (GPT-4o-mini) achieved the best performance, outperforming zero-shot, few-shot, and Chain-of-Thought prompt-based classification as well as a neural network (BERT) trained on data augmented in different ways (synthetic generation, translation). Our results also indicated that for the classification of more granular classes, that is, for more specific tasks, training a neural network generally performed better than prompt-based classification using an LLM.
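The three prompting strategies named in the abstract (zero-shot, few-shot, Chain-of-Thought) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label names, the prompt wording, and the helpers `build_prompt` and `parse_label` are assumptions made for illustration only, and the call to an actual LLM is left out.

```python
from typing import Optional, Sequence, Tuple

# Assumed label names for the binary task; the paper's exact labels may differ.
LABELS = ("sexist", "not sexist")


def build_prompt(text: str,
                 strategy: str = "zero-shot",
                 examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a classification prompt for a single post (hypothetical wording)."""
    parts = [
        "Classify the following Turkish social media post as "
        f"'{LABELS[0]}' or '{LABELS[1]}'."
    ]
    if strategy == "few-shot":
        # Prepend labeled demonstrations before the post to classify.
        for ex_text, ex_label in examples:
            parts.append(f"Post: {ex_text}\nLabel: {ex_label}")
        parts.append(f"Post: {text}\nLabel:")
    elif strategy == "cot":
        # Ask for reasoning first, then a final line that starts with 'Label:'.
        parts.append("Reason step by step, then end with a line "
                     "starting with 'Label:'.")
        parts.append(f"Post: {text}\nReasoning:")
    elif strategy == "zero-shot":
        parts.append(f"Post: {text}\nLabel:")
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return "\n\n".join(parts)


def parse_label(response: str) -> Optional[str]:
    """Map a free-text model response onto one of LABELS, or None."""
    # Only look at the text after the last 'label:' marker, if there is one.
    tail = response.lower().rsplit("label:", 1)[-1]
    # Check the longer label first, because 'not sexist' contains 'sexist'.
    if "not sexist" in tail:
        return "not sexist"
    if "sexist" in tail:
        return "sexist"
    return None
```

In use, the string returned by `build_prompt` would be sent to whichever LLM is being evaluated, and `parse_label` would be applied to the text of its reply before computing metrics.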
Anthology ID:
2025.ranlp-1.89
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
771–779
URL:
https://aclanthology.org/2025.ranlp-1.89/
Cite (ACL):
Lutfiye Seda Mut Altin and Horacio Saggion. 2025. Investigating Large Language Models’ (LLMs) Capabilities for Sexism Detection on a Low-Resource Language. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 771–779, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Investigating Large Language Models’ (LLMs) Capabilities for Sexism Detection on a Low-Resource Language (Mut Altin & Saggion, RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.89.pdf