GoldenWind at SemEval-2021 Task 5: Orthrus - An Ensemble Approach to Identify Toxicity

Marco Palomino, Dawid Grad, James Bedwell


Abstract
Many new approaches to detecting and mitigating toxicity are currently being evaluated. We are particularly interested in the correlation between toxicity and the emotions expressed in online posts. While toxicity may be disguised by amending the wording of posts, the emotions they express will not be. Therefore, we describe here an ensemble method to identify toxicity and classify the emotions expressed in a corpus of annotated posts published by Task 5 of SemEval 2021. Our analysis shows that the majority of such posts express anger, sadness and fear. Our method to identify toxicity combines a lexicon-based approach, which on its own achieves an F1 score of 61.07%, with a supervised learning approach, which on its own achieves an F1 score of 60%. When both methods are combined, the ensemble achieves an F1 score of 66.37%.
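
The abstract does not say how the two approaches are combined, so the following is only a minimal sketch of one plausible ensemble rule, not the authors' implementation. It assumes that predictions are represented as sets of character offsets, that the two predictors are merged by taking the union of their offsets, and that F1 is computed over those offsets. The names `TOY_LEXICON`, `lexicon_spans`, `combine_spans` and `span_f1` are hypothetical, and the supervised model is stood in for by a hand-written set of offsets.

```python
import re
from typing import Iterable, Set

# Hypothetical toy lexicon; the lexicon used in the paper is not given in the abstract.
TOY_LEXICON = {"idiot", "stupid"}


def lexicon_spans(text: str, lexicon: Iterable[str] = TOY_LEXICON) -> Set[int]:
    """Return the character offsets covered by any lexicon word in `text`."""
    offsets: Set[int] = set()
    for word in lexicon:
        pattern = r"\b" + re.escape(word) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            offsets.update(range(match.start(), match.end()))
    return offsets


def combine_spans(lexicon_offsets: Set[int], model_offsets: Set[int]) -> Set[int]:
    """Assumed ensemble rule: keep every offset flagged by either predictor."""
    return lexicon_offsets | model_offsets


def span_f1(predicted: Set[int], gold: Set[int]) -> float:
    """F1 between predicted and gold character offsets for a single post."""
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    post = "You are a stupid idiot"
    gold = set(range(10, 22))        # hypothetical annotation
    from_lexicon = lexicon_spans(post)
    from_model = {16}                # stand-in for a supervised model's output
    print(span_f1(from_lexicon, gold))
    print(span_f1(combine_spans(from_lexicon, from_model), gold))
```

A union favours recall, which is one way an ensemble could outscore both of its components; an intersection or a weighted vote would be equally valid readings of the abstract.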
Anthology ID:
2021.semeval-1.115
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP | SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
860–864
URL:
https://aclanthology.org/2021.semeval-1.115
DOI:
10.18653/v1/2021.semeval-1.115
PDF:
https://aclanthology.org/2021.semeval-1.115.pdf