TSR@CASE 2025: Low Dimensional Multimodal Fusion Using Multiplicative Fine Tuning Modules

Sushant Kr. Ray, Rafiq Ali, Abdullah Mohammad, Ebad Shabbir, Samar Wazir


Abstract
This paper describes our submission to the CASE 2025 shared task on multimodal hate event detection, which covers four subtasks on text-embedded images, each framed as a classification problem: hate detection, hate target identification, stance determination, and humour detection. We submitted entries to all four subtasks. We propose FIMIF, a lightweight and efficient classification model that leverages frozen CLIP encoders. The model uses a feature interaction module that lets it exploit multiplicative interactions between features without any manual feature engineering. Our results show that the model achieves performance comparable or superior to that of larger models, despite having a significantly smaller parameter count.
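The abstract does not spell out how the feature interaction module is built, so the following is only a minimal sketch of one common way to obtain multiplicative interactions between two modalities. It assumes that each frozen encoder yields a fixed-length embedding, that both embeddings are projected to a small shared dimension, and that the interaction is an element-wise (Hadamard) product. The class name `MultiplicativeFusionHead`, the dimensions (512, 16), and the random stand-in embeddings are all hypothetical and are not taken from the paper. Plain Python is used so the example runs without any dependencies.

```python
import random


def init_linear(rng, out_dim, in_dim, scale=0.1):
    """Create a small random weight matrix (list of rows) and a zero bias."""
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(weights, bias, x):
    """Compute y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class MultiplicativeFusionHead:
    """Hypothetical trainable head on top of frozen encoders.

    This is an illustrative assumption, not the paper's FIMIF module.
    """

    def __init__(self, embed_dim, low_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.img_proj = init_linear(rng, low_dim, embed_dim)
        self.txt_proj = init_linear(rng, low_dim, embed_dim)
        # The classifier sees both projections plus their product.
        self.classifier = init_linear(rng, num_classes, 3 * low_dim)

    def fuse(self, image_emb, text_emb):
        u = linear(*self.img_proj, image_emb)   # low-dimensional image view
        v = linear(*self.txt_proj, text_emb)    # low-dimensional text view
        interaction = [a * b for a, b in zip(u, v)]  # Hadamard product
        return [*u, *v, *interaction]

    def logits(self, image_emb, text_emb):
        return linear(*self.classifier, self.fuse(image_emb, text_emb))


# Random vectors stand in for the outputs of the frozen encoders.
rng = random.Random(1)
image_emb = [rng.gauss(0.0, 1.0) for _ in range(512)]
text_emb = [rng.gauss(0.0, 1.0) for _ in range(512)]

head = MultiplicativeFusionHead(embed_dim=512, low_dim=16, num_classes=2)
fused = head.fuse(image_emb, text_emb)
scores = head.logits(image_emb, text_emb)
```

In a setup like this only the two projections and the classifier would be trained, which is one way a model built on frozen encoders can keep its trainable parameter count small.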
Anthology ID:
2025.case-1.15
Volume:
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Ali Hürriyetoğlu, Hristo Tanev, Surendrabikram Thapa
Venues:
CASE | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
123–132
URL:
https://aclanthology.org/2025.case-1.15/
Cite (ACL):
Sushant Kr. Ray, Rafiq Ali, Abdullah Mohammad, Ebad Shabbir, and Samar Wazir. 2025. TSR@CASE 2025: Low Dimensional Multimodal Fusion Using Multiplicative Fine Tuning Modules. In Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts, pages 123–132, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
TSR@CASE 2025: Low Dimensional Multimodal Fusion Using Multiplicative Fine Tuning Modules (Ray et al., CASE 2025)
PDF:
https://aclanthology.org/2025.case-1.15.pdf