GoldenWind at SemEval-2021 Task 5: Orthrus - An Ensemble Approach to Identify Toxicity

Marco Palomino, Dawid Grad, James Bedwell


Abstract
Many new developments to detect and mitigate toxicity are currently being evaluated. We are particularly interested in the correlation between toxicity and the emotions expressed in online posts. While toxicity may be disguised by amending the wording of posts, emotions will not be. Therefore, we describe here an ensemble method to identify toxicity and classify the emotions expressed in a corpus of annotated posts published by Task 5 of SemEval 2021. Our analysis shows that the majority of such posts express anger, sadness and fear. Our method to identify toxicity combines a lexicon-based approach, which on its own achieves an F1 score of 61.07%, with a supervised learning approach, which on its own achieves an F1 score of 60%. When both methods are combined, the ensemble achieves an F1 score of 66.37%.
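As a rough illustration of how a lexicon-based detector and a supervised detector might be combined and scored, the sketch below treats each prediction as a set of toxic character offsets. Everything in it is an assumption made for illustration: the abstract does not state the combination rule (a simple union is used here), the lexicon is a hypothetical toy word list, the supervised prediction is a hand-written stand-in rather than a trained model, and the F1 function follows the common character-offset formulation rather than anything confirmed by this page.

```python
import re

# Hypothetical toy lexicon; the lexicon used in the paper is not given here.
TOXIC_LEXICON = {"idiot", "stupid"}


def lexicon_offsets(text, lexicon=TOXIC_LEXICON):
    """Return the character offsets covered by words found in the lexicon."""
    offsets = set()
    for match in re.finditer(r"\w+", text):
        if match.group().lower() in lexicon:
            offsets.update(range(match.start(), match.end()))
    return offsets


def combine(lexicon_pred, model_pred):
    """Assumed ensemble rule: the union of both sets of offsets."""
    return set(lexicon_pred) | set(model_pred)


def span_f1(predicted, gold):
    """F1 over character offsets for a single post."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    post = "You are a stupid idiot"
    gold = set(range(10, 16)) | set(range(17, 22))  # "stupid" and "idiot"
    lexicon_pred = lexicon_offsets(post)
    model_pred = set(range(17, 22))  # stand-in for a supervised model's output
    ensemble_pred = combine(lexicon_pred, model_pred)
    print(span_f1(model_pred, gold), span_f1(ensemble_pred, gold))
```

A union favours recall, which is one plausible reason an ensemble could outscore either component; an intersection or a weighted vote would be equally valid alternatives to try.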
Anthology ID:
2021.semeval-1.115
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
860–864
URL:
https://aclanthology.org/2021.semeval-1.115
DOI:
10.18653/v1/2021.semeval-1.115
Cite (ACL):
Marco Palomino, Dawid Grad, and James Bedwell. 2021. GoldenWind at SemEval-2021 Task 5: Orthrus - An Ensemble Approach to Identify Toxicity. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 860–864, Online. Association for Computational Linguistics.
Cite (Informal):
GoldenWind at SemEval-2021 Task 5: Orthrus - An Ensemble Approach to Identify Toxicity (Palomino et al., SemEval 2021)
PDF:
https://aclanthology.org/2021.semeval-1.115.pdf
Data:
Civil Comments