Dimitar Dimitrov


2021

pdf bib
Detecting Propaganda Techniques in Memes
Dimitar Dimitrov | Bishr Bin Ali | Shaden Shaar | Firoj Alam | Fabrizio Silvestri | Hamed Firooz | Preslav Nakov | Giovanni Da San Martino
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Propaganda can be defined as a form of communication that aims to influence the opinions or the actions of people towards a specific goal; this is achieved by means of well-defined rhetorical and psychological devices. Propaganda, in the form we know it today, can be dated back to the beginning of the 17th century. However, it is with the advent of the Internet and the social media that propaganda has started to spread on a much larger scale than before, thus becoming major societal and political issue. Nowadays, a large fraction of propaganda in social media is multimodal, mixing textual with visual content. With this in mind, here we propose a new multi-label multimodal task: detecting the type of propaganda techniques used in memes. We further create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both. Our analysis of the corpus shows that understanding both modalities together is essential for detecting these techniques. This is further confirmed in our experiments with several state-of-the-art multimodal models.

pdf bib
Detecting Harmful Memes and Their Targets
Shraman Pramanick | Dimitar Dimitrov | Rituparna Mukherjee | Shivam Sharma | Md. Shad Akhtar | Preslav Nakov | Tanmoy Chakraborty
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
MOMENTA: A Multimodal Framework for Detecting Harmful Memes and Their Targets
Shraman Pramanick | Shivam Sharma | Dimitar Dimitrov | Md. Shad Akhtar | Preslav Nakov | Tanmoy Chakraborty
Findings of the Association for Computational Linguistics: EMNLP 2021

Internet memes have become powerful means to transmit political, psychological, and socio-cultural ideas. Although memes are typically humorous, recent days have witnessed an escalation of harmful memes used for trolling, cyberbullying, and abuse. Detecting such memes is challenging as they can be highly satirical and cryptic. Moreover, while previous work has focused on specific aspects of memes such as hate speech and propaganda, there has been little work on harm in general. Here, we aim to bridge this gap. In particular, we focus on two tasks: (i)detecting harmful memes, and (ii) identifying the social entities they target. We further extend the recently released HarMeme dataset, which covered COVID-19, with additional memes and a new topic: US politics. To solve these tasks, we propose MOMENTA (MultimOdal framework for detecting harmful MemEs aNd Their tArgets), a novel multimodal deep neural network that uses global and local perspectives to detect harmful memes. MOMENTA systematically analyzes the local and the global perspective of the input meme (in both modalities) and relates it to the background context. MOMENTA is interpretable and generalizable, and our experiments show that it outperforms several strong rivaling approaches.

pdf bib
SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images
Dimitar Dimitrov | Bishr Bin Ali | Shaden Shaar | Firoj Alam | Fabrizio Silvestri | Hamed Firooz | Preslav Nakov | Giovanni Da San Martino
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems. The task focused on memes and had three subtasks: (i) detecting the techniques in the text, (ii) detecting the text spans where the techniques are used, and (iii) detecting techniques in the entire meme, i.e., both in the text and in the image. It was a popular task, attracting 71 registrations, and 22 teams that eventually made an official submission on the test set. The evaluation results for the third subtask confirmed the importance of both modalities, the text and the image. Moreover, some teams reported benefits when not just combining the two modalities, e.g., by using early or late fusion, but rather modeling the interaction between them in a joint model.