hate-alert@DravidianLangTech-ACL2022: Ensembling Multi-Modalities for Tamil TrollMeme Classification

Social media platforms often act as breeding grounds for various forms of trolling or malicious content targeting users or communities. One way of trolling users is by creating memes, which in most cases combine an image with a short piece of text embedded on top of it. The situation is more complex for multilingual (e.g., Tamil) memes due to the lack of benchmark datasets and models. We explore several models to detect troll memes in Tamil based on the shared task "Troll Meme Classification in DravidianLangTech2022" at ACL-2022. We observe that while the text-based model MURIL performs better for non-troll meme classification, the image-based model VGG16 performs better for troll meme classification. Fusing these two modalities further helps us achieve stable outcomes in both classes. Our fusion model achieved a 0.561 weighted average F1 score and ranked second in this task.


Introduction
Over the past few years, social media platforms have been expanding rapidly. Users of these platforms interact by sharing content to enrich their knowledge and social connections. Although most of the content on social media platforms so far has been textual, a unique type of message has recently emerged: the meme. A meme is usually created from an image and a short piece of text on top of it, embedded as part of the image. Memes are generally meant to be harmless and conceived to look humorous, but sometimes bad actors use memes to threaten and abuse individuals or specific target communities. Such memes are collectively known as offensive/troll memes on social media.
Trolling is the practice of publishing a message via social media that is intended to be abusive, inciting, or threatening, and that often contains rambling or off-topic content to provoke the audience (Bishop, 2014; Suryawanshi et al., 2020a). In addition, such memes can be treacherous. The situation for countries like India is more complicated due to the immense language diversity. A meme in the Indian context can be composed in English, in a local language (in native or foreign script), or in a combination of both languages and scripts. This adds another challenge to troll meme classification.
Recently, there has been considerable effort to investigate the malicious side of memes, e.g., focusing on hateful (Gomez et al., 2020), offensive (Suryawanshi et al., 2020a), and harmful (Pramanick et al., 2021) memes. However, the majority of these studies center on the English language. Further, several shared tasks, such as HASOC 2021 (Modha et al., 2021) and DravidianLangTech 2021 (Chakravarthi et al., 2021), have been organized for hostile content detection in multiple languages in the Indian context, but they are limited to textual classification. Extending those tasks, the organizers of this shared task set up a classification task to identify troll memes in Tamil, providing 2,967 memes. This paper illustrates the methodologies we used to identify Tamil troll memes, which helped us achieve second place in the final leaderboard standings of the shared task.

Related Work
This section discusses some of the text-based abusive content detection methods and briefly explains the multi-modal techniques used so far to detect malicious memes.

Text-based abusive content detection
Recently, a lot of work has been carried out to identify abusive speech using text from social media posts (Das et al., 2020). In 2017, Davidson et al. (2017) released a Twitter dataset in which thousands of tweets were labeled as offensive, hate, or neither. The earlier efforts to create such classifiers used simple methods such as linguistic features, word n-grams, bag-of-words, etc. (Davidson et al., 2017). With the availability of larger datasets, researchers have started utilizing more complex models, such as deep learning and graph embedding (Das et al., 2021b) strategies, to improve the performance of hate speech classifiers on social media posts. In 2018, Pitsilis et al. (2018) used deep learning-based models, such as recurrent neural networks (RNNs), to detect abusive tweets in English and found them quite effective for this task; RNNs have also been shown to work well with several language models. In addition, other neural network models, such as LSTMs and CNNs, have succeeded in detecting abusive speech (Goldberg, 2015; la Peña Sarracén et al., 2018). Recently, Transformer-based (Vaswani et al., 2017) language models such as BERT (Devlin et al., 2019) have become quite prevalent in several downstream tasks, such as spam detection and classification (Das et al., 2021a; Banerjee et al., 2021). Having observed the exceptional performance of these Transformer-based models, we also utilize one such model, MURIL, which is pre-trained explicitly on Indian languages.

Multi-modal abusive content detection
Lately, several datasets have been made available to the research community for abusive meme detection. Sabat et al. (2019) created a dataset of 5,020 memes for hate speech detection. The MMHS150K hate meme dataset developed by Gomez et al. (2020) is one of the largest such datasets, collected from Twitter and consisting of 150K posts. Similarly, Facebook AI released the Hateful Memes dataset (Kiela et al., 2020), and further datasets targeting harmful memes have followed (Pramanick et al., 2021; Chandra et al., 2021).
In this work, we use the VGG16 model, which is extensively used for several classification problems, to extract the features of all the memes, and combine these with the textual features to design our final model.

Dataset Description
The shared task on Troll Meme Classification in DravidianLangTech2022 (Suryawanshi et al., 2022) at ACL-2022 is a classification problem with the aim of moderating and minimizing offensive/harmful content on social media. The objective of the shared task is to devise methodologies and vision-language models for troll meme detection in Tamil. We show the class distribution of the dataset (Suryawanshi et al., 2020b; Suryawanshi and Chakravarthi, 2021) in Table 1. The training set consists of 2,300 memes (out of which 1,282 are labeled as troll memes), and the test set consists of 667 memes. In addition, Latin-transcribed texts were shared for all memes. We show examples of both troll and non-troll memes in Figure 1.

Methodology
In this section, we discuss the different parts of the pipeline that we pursued for detecting troll memes using the dataset.

Uni-modal Models
As part of our initial experiments, we created the following two uni-modal models, one utilizing text features and the other using image-based features.
MURIL: MURIL (Khanuja et al., 2021) is a Transformer encoder with 12 layers, 12 attention heads, and 768 dimensions. We used the pre-trained model, which has been trained on 17 Indian languages and their transliterated counterparts using the masked language modeling (MLM) and next sentence prediction (NSP) loss functions. The pre-training data was obtained from the publicly available Wikipedia and Common Crawl corpora. We pass all the text associated with a meme through MURIL to get a 768-dimensional feature vector for each meme, which we then feed to an output node for the final prediction.
VGG16: VGG16 (Simonyan and Zisserman, 2014) is a Convolutional Neural Network architecture, a variant of the VGG model that consists of 16 layers and is very appealing because of its uniform architecture. We pass each image (meme) through VGG16 to get a 256-dimensional feature vector, then pass it through two dense layers of size 256 (with a dropout of 0.5) and 64, and finally feed it to the output node for the final prediction.
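As a minimal sketch of the two classification heads, the following numpy code illustrates their shapes and layer structure. The random weights are hypothetical stand-ins for the trained parameters, the backbone embeddings are simulated rather than produced by MURIL/VGG16, and dropout is omitted since it is only active during training:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dense(x, w, b, relu=True):
    """One fully connected layer; ReLU everywhere except the output node."""
    out = x @ w + b
    return np.maximum(out, 0.0) if relu else out

# Text head: a 768-d MURIL embedding fed directly to a single output node.
text_emb = rng.normal(size=768)                     # stand-in for MURIL output
w_text, b_text = rng.normal(size=(768, 1)) * 0.01, np.zeros(1)
p_text = sigmoid(dense(text_emb, w_text, b_text, relu=False))

# Image head: a 256-d VGG16 feature passed through dense layers of 256 and 64.
img_emb = rng.normal(size=256)                      # stand-in for VGG16 output
w1, b1 = rng.normal(size=(256, 256)) * 0.01, np.zeros(256)
w2, b2 = rng.normal(size=(256, 64)) * 0.01, np.zeros(64)
w3, b3 = rng.normal(size=(64, 1)) * 0.01, np.zeros(1)
hidden = dense(dense(img_emb, w1, b1), w2, b2)
p_img = sigmoid(dense(hidden, w3, b3, relu=False))

print(p_text.shape, p_img.shape)  # each head emits one troll probability
```

Each head independently maps its modality to a single sigmoid probability, which is what allows the two to be trained and evaluated separately before fusion.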

Fusion Model
The uni-modal models described so far do not use the relation between the text and image present in a meme. To better capture this relation, we design a new MURIL+VGG16 fusion classifier: we first concatenate the embeddings from the MURIL and VGG16 models discussed above, then pass the concatenated embedding to a classification node for the final prediction. The details of the pipeline are presented in Figure 2.
All the models are trained with the binary cross-entropy loss function and the Adam optimizer for 20 epochs.
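The fusion step and its training loss can be sketched as follows. The batch of embeddings is simulated with random vectors (stand-ins for the real MURIL and VGG16 features), and the classifier weights are hypothetical untrained values; only the concatenation shape and the binary cross-entropy computation are meant to mirror the pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(y_true, p):
    """Binary cross-entropy, the loss used to train all three models."""
    eps = 1e-7
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

# Hypothetical batch of 4 memes: 768-d MURIL text embeddings and
# 256-d VGG16 image features, concatenated into 1024-d fused vectors.
text_emb = rng.normal(size=(4, 768))
img_emb = rng.normal(size=(4, 256))
fused = np.concatenate([text_emb, img_emb], axis=1)

# Single classification node on top of the fused embedding.
w, b = rng.normal(size=(1024, 1)) * 0.01, 0.0
p_troll = sigmoid(fused @ w + b).ravel()

labels = np.array([1.0, 0.0, 1.0, 0.0])  # 1 = troll, 0 = non-troll
print("fused shape:", fused.shape, "loss:", bce(labels, p_troll))
```

In the actual pipeline this loss would be minimized with Adam over 20 epochs; the sketch only performs one forward pass.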

Results
Table 2 demonstrates the performance of each model. Among the uni-modal models, VGG16 has the highest accuracy (MURIL: 0.556, VGG16: 0.587) and F1 score (MURIL: 0.637, VGG16: 0.736) for the troll class, though in terms of weighted F1 score (MURIL: 0.552, VGG16: 0.458), the text-based model MURIL performs better. When we fuse these two models, the fusion model achieves the highest weighted F1 score (0.561) among all the models. To further understand each model's weaknesses, we show their confusion matrices in Figure 3. We observe that MURIL performs better on the non-troll meme data points, whereas VGG16 performs better on the troll meme data points but shows inferior performance on the non-troll ones. The fusion model combines the positive characteristics of both MURIL and VGG16 and performs the best by better capturing the connections between the text and image of the memes.
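For clarity on the evaluation metric, the weighted average F1 score is the per-class F1 averaged with weights proportional to each class's support. A small self-contained sketch (the labels below are made-up toy predictions, not the shared-task data):

```python
from collections import Counter

def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def weighted_f1(y_true, y_pred, classes=("troll", "non-troll")):
    """Per-class F1 weighted by class support, as reported in Table 2."""
    support = Counter(y_true)
    total, score = len(y_true), 0.0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        score += support[c] / total * f1(tp, fp, fn)
    return score

# Toy example: troll F1 = 2/3 (support 3), non-troll F1 = 0.5 (support 2).
y_true = ["troll", "troll", "non-troll", "non-troll", "troll"]
y_pred = ["troll", "non-troll", "non-troll", "troll", "troll"]
print(round(weighted_f1(y_true, y_pred), 3))  # 3/5 * 2/3 + 2/5 * 0.5 = 0.6
```

This weighting explains why VGG16's strong troll-class F1 (0.736) does not translate into a strong weighted score: its weakness on the non-troll class drags the support-weighted average down to 0.458.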

Conclusion
In this shared task, we dealt with the novel problem of detecting Tamil troll memes. We evaluated different uni-modal models and introduced a fusion model. We found that the text-based model MURIL performs better on the non-troll class, whereas VGG16 performs better on the troll class. Ensembling these two models helps us gain stable outcomes in both classes. As an immediate next step, we plan to explore other vision-based models to further improve classification performance.
Figure 1: Examples of troll and non-troll memes

Figure 2: Our fusion model architecture with VGG16 and MURIL

Figure 3: Confusion Matrix on Test Data for Each Model

Table 2: Performance Comparisons of Each Model. T: Troll class. w: Weighted average. The best performance in each column is marked in bold and the second best is underlined.