MLInitiative at CASE 2025: Multimodal Detection of Hate Speech, Humor, and Stance using Transformers

Ashish Acharya, Ankit Bk, Bikram K.c., Surabhi Adhikari, Rabin Thapa, Sandesh Shrestha, Tina Lama


Abstract
In recent years, memes have emerged as a popular form of online satire and critique, merging entertainment, social commentary, and political discourse. At the same time, memes have become a medium for spreading hate speech, misinformation, and bigotry, especially toward marginalized communities, including the LGBTQ+ population. Addressing this problem calls for advanced multimodal systems that analyze the complex interplay between text and visuals in memes. This paper describes our work in the CASE@RANLP 2025 shared task, for which we developed systems for hate speech detection, target identification, stance classification, and humor recognition within the text of memes. We investigate two multimodal transformer-based systems, ResNet-18 with BERT and SigLIP2, for these subtasks. Our results show that SigLIP2 consistently outperforms the baseline, achieving an F1 score of 79.27 in hate speech detection and 72.88 in humor classification, with competitive performance in stance classification (60.59) and target detection (54.86). Through this study, we aim to contribute to the development of ethically grounded, inclusive NLP systems capable of interpreting complex sociolinguistic narratives in multimodal content.
Anthology ID:
2025.case-1.11
Volume:
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Ali Hürriyetoğlu, Hristo Tanev, Surendrabikram Thapa
Venues:
CASE | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
91–97
URL:
https://aclanthology.org/2025.case-1.11/
Cite (ACL):
Ashish Acharya, Ankit Bk, Bikram K.c., Surabhi Adhikari, Rabin Thapa, Sandesh Shrestha, and Tina Lama. 2025. MLInitiative at CASE 2025: Multimodal Detection of Hate Speech, Humor, and Stance using Transformers. In Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts, pages 91–97, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
MLInitiative at CASE 2025: Multimodal Detection of Hate Speech, Humor, and Stance using Transformers (Acharya et al., CASE 2025)
PDF:
https://aclanthology.org/2025.case-1.11.pdf