Kangli Zhang


DD-TIG at SemEval-2022 Task 5: Investigating the Relationships Between Multimodal and Unimodal Information in Misogynous Memes Detection and Classification
Ziming Zhou | Han Zhao | Jingjing Dong | Ning Ding | Xiaolong Liu | Kangli Zhang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes our submission to Task 5, Multimedia Automatic Misogyny Identification (MAMI), at SemEval-2022. The task is to detect and classify misogynous memes. To use both the textual and the visual information in a meme, we investigate several recent visual-language transformer-based multimodal models and choose ERNIE-ViL-Large as our base model. For subtask A, having observed that models overfit on unimodal patterns, we propose strategies to mitigate the problems of biased words and template memes. For subtask B, we transform the multi-label problem into a multi-class one and experiment with oversampling and complementary techniques. Our approach placed 2nd on subtask A and 5th on subtask B in this competition.
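The abstract does not say how the multi-label problem in subtask B is turned into a multi-class one, nor which oversampling scheme is used. One common reading is a label-powerset mapping, where every distinct combination of labels becomes its own class, followed by random oversampling of the rarer classes. The sketch below illustrates that reading only; the function names `to_powerset_classes` and `random_oversample`, the four-column label layout, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def to_powerset_classes(label_rows):
    """Map each distinct multi-label combination to a single class id.

    label_rows: list of binary label vectors, one per sample.
    Returns the per-sample class ids and the combination-to-id mapping.
    """
    combo_to_class = {}
    class_ids = []
    for row in label_rows:
        combo = tuple(row)
        if combo not in combo_to_class:
            # Assign ids in order of first appearance.
            combo_to_class[combo] = len(combo_to_class)
        class_ids.append(combo_to_class[combo])
    return class_ids, combo_to_class


def random_oversample(samples, class_ids, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, cls in zip(samples, class_ids):
        by_class.setdefault(cls, []).append(sample)
    target = max(len(items) for items in by_class.values())

    out_samples, out_classes = [], []
    for cls, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_classes.append(cls)
    return out_samples, out_classes


# Toy data (hypothetical): four binary labels per meme.
labels = [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 1], [1, 0, 0, 0]]
memes = ["meme_a", "meme_b", "meme_c", "meme_d"]

class_ids, mapping = to_powerset_classes(labels)
balanced_memes, balanced_classes = random_oversample(memes, class_ids)
print(Counter(balanced_classes))
```

A powerset mapping keeps label co-occurrence intact at the cost of a class count that grows with the number of observed combinations, which is one reason class imbalance and oversampling become relevant after such a transformation.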