YNU-HPCC at SemEval-2021 Task 5: Using a Transformer-based Model with Auxiliary Information for Toxic Span Detection

Ruijun Chen, Jin Wang, Xuejie Zhang


Abstract
Toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text. In this paper, a transformer-based model with auxiliary information is proposed for SemEval-2021 Task 5. The proposed model was implemented based on the BERT-CRF architecture. It consists of three parts: a transformer-based model that can obtain the token representation, an auxiliary information module that combines features from different layers, and an output layer used for the classification. Various BERT-based models, such as BERT, ALBERT, RoBERTa, and XLNET, were used to learn contextual representations. The predictions of these models were assembled to improve the sequence labeling tasks by using a voting strategy. Experimental results showed that the introduced auxiliary information can improve the performance of toxic spans detection. The proposed model ranked 5th of 91 in the competition. The code of this study is available at https://github.com/Chenrj233/semeval2021_task5
Anthology ID:
2021.semeval-1.112
Volume:
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
841–845
Language:
URL:
https://aclanthology.org/2021.semeval-1.112
DOI:
10.18653/v1/2021.semeval-1.112
Bibkey:
Cite (ACL):
Ruijun Chen, Jin Wang, and Xuejie Zhang. 2021. YNU-HPCC at SemEval-2021 Task 5: Using a Transformer-based Model with Auxiliary Information for Toxic Span Detection. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), pages 841–845, Online. Association for Computational Linguistics.
Cite (Informal):
YNU-HPCC at SemEval-2021 Task 5: Using a Transformer-based Model with Auxiliary Information for Toxic Span Detection (Chen et al., SemEval 2021)
Copy Citation:
PDF:
https://aclanthology.org/2021.semeval-1.112.pdf
Code
 chenrj233/semeval2021_task5