Ailneni Rakshitha Rao
2022
ASRtrans at SemEval-2022 Task 4: Ensemble of Tuned Transformer-based Models for PCL Detection
Ailneni Rakshitha Rao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Patronizing behavior is a subtle form of bullying, and when directed towards vulnerable communities, it can give rise to inequalities. This paper describes our system for Task 4 of SemEval-2022: Patronizing and Condescending Language Detection (PCL). We participated in both sub-tasks and conducted extensive experiments to analyze how data augmentation and the choice of loss function help tackle the problem of class imbalance. We explore whether large transformer-based models can capture the intricacies associated with PCL detection. Our solution is an ensemble that combines a RoBERTa model, further trained on external data, with other language models such as XLNet, ERNIE 2.0, and BERT. We also present the results of several problem transformation techniques for multi-label classification, such as Classifier Chains, Label Powerset, and Binary Relevance.
ASRtrans at SemEval-2022 Task 5: Transformer-based Models for Meme Classification
Ailneni Rakshitha Rao | Arjun Rao
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Women are frequently targeted online with hate speech and misogyny through tweets, memes, and other forms of communication. This paper describes our system for Task 5 of SemEval-2022: Multimedia Automatic Misogyny Identification (MAMI). We participated in both sub-tasks, using a transformer-based architecture to combine image and text features. We explore models with multi-modal pre-training (VisualBERT) and text-based pre-training (MMBT) and compare their results. We also show how additional training with task-related external data can improve model performance. We achieved sizable improvements over baseline models, and the official evaluation ranked our system 3rd out of 83 teams on the binary classification task (Sub-task A) with an F1 score of 0.761, and 7th out of 48 teams on the multi-label classification task (Sub-task B) with an F1 score of 0.705.