Qihui Lin
2021
HITSZ-HLT at SemEval-2021 Task 5: Ensemble Sequence Labeling and Span Boundary Detection for Toxic Span Detection
Qinglin Zhu | Zijie Lin | Yice Zhang | Jingyi Sun | Xiang Li | Qihui Lin | Yixue Dang | Ruifeng Xu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
This paper presents the winning system in SemEval-2021 Task 5: Toxic Spans Detection. The task is to locate the spans within a text that contribute to its toxicity, which is crucial for semi-automated moderation of online discussions. We formalize the task separately as a Sequence Labeling (SL) problem and a Span Boundary Detection (SBD) problem, and employ three state-of-the-art models. We then integrate the predictions of these models to produce a more credible and complementary result. Our system achieves a char-level score of 70.83%, ranking 1st out of 91. We also explore a lexicon-based method, which is strongly interpretable and flexible in practice.