SarcasmDet at SemEval-2021 Task 7: Detect Humor and Offensive based on Demographic Factors using RoBERTa Pre-trained Model

Dalya Faraj, Malak Abdullah


Abstract
This paper presents one of the top winning solution systems for task 7 at SemEval2021, HaHackathon: Detecting and Rating Humor and Offense. This competition is divided into two tasks, task1 with three sub-tasks 1a,1b, and 1c, and task2. The goal for task1 is to predict if the text would be considered humorous or not, and if it is yes, then predict how humorous it is and whether the humor rating would be perceived as controversial. The goal of the task2 is to predict how the text is considered offensive for users in general. Our solution has been developed using RoBERTa pre-trained model with ensemble techniques. The paper describes the submitted solution system’s architecture with the experiments and the hyperparameter tuning that led to this robust system. Our model ranked third and fourth places out of 50 teams in tasks 1c and 1a with F1-Score of 0.6270 and 0.9675, respectively. At the same time, the model ranked one of the top 10 models in task 1b and task 2 with an RMSE scores of 0.5446 and 0.4469, respectively.
Anthology ID:
2021.semeval-1.64
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP | SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
527–533
Language:
URL:
https://aclanthology.org/2021.semeval-1.64
DOI:
10.18653/v1/2021.semeval-1.64
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.semeval-1.64.pdf