Prerana Rane


2025

MMFusion@CASE 2025: Attention-Based Multimodal Learning for Text-Image Content Analysis
Prerana Rane
Proceedings of the 8th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Texts

Text-embedded images, such as memes, are increasingly common in social media discourse. These images combine visual and textual elements to convey complex attitudes and emotions, and deciphering their intent is challenging due to their multimodal and context-dependent nature. This paper presents our approach to the Shared Task on Multimodal Hate, Humor, and Stance Detection in Marginalized Movement at CASE 2025. The shared task covers four key aspects of multimodal content analysis for text-embedded images: hate speech detection, target identification, stance classification, and humor recognition. We propose a multimodal learning framework that uses both textual and visual representations, along with cross-modal attention mechanisms, to classify content effectively across all four tasks.
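To illustrate the cross-modal attention idea mentioned in the abstract, below is a minimal sketch in PyTorch, assuming a standard setup in which text token features attend to image patch features before pooling for classification. The module name, feature dimensions, and encoder choices are illustrative assumptions; the paper's actual architecture is not reproduced here.

```python
# A minimal sketch of cross-modal attention (illustrative, not the
# paper's exact architecture). Text tokens act as queries over image
# patches; names and dimensions are assumptions for demonstration.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        # Multi-head attention: text features as queries,
        # image features as keys and values.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_feats: torch.Tensor,
                image_feats: torch.Tensor) -> torch.Tensor:
        # text_feats:  (batch, text_len, dim),  e.g. BERT token embeddings
        # image_feats: (batch, n_patches, dim), e.g. ViT patch embeddings
        attended, _ = self.attn(query=text_feats,
                                key=image_feats,
                                value=image_feats)
        # Residual connection + layer norm, then mean-pool the fused
        # sequence into a single multimodal representation.
        fused = self.norm(text_feats + attended)
        return fused.mean(dim=1)  # (batch, dim)

# Usage: fuse dummy text/image features and classify (e.g. hate vs. not).
text = torch.randn(2, 32, 768)    # placeholder text token features
image = torch.randn(2, 196, 768)  # placeholder image patch features
logits = nn.Linear(768, 2)(CrossModalAttention()(text, image))
```

A shared classification head over the fused representation is one plausible way a single framework could serve all four tasks; per-task heads are another common choice.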