A Stacking-based Efficient Method for Toxic Language Detection on Live Streaming Chat

Yuto Oikawa, Yuki Nakayama, Koji Murakami


Abstract
In live streaming chat on a video streaming service, it is crucial to filter out toxic comments with online processing so that users do not read them in real time. However, recent toxic language detection methods rely on deep learning models, which may not scale given their inference speed. These methods also do not consider the computational resource constraints of the deployed system (e.g., no GPU resources). This paper presents an efficient method for toxic language detection that is aware of real-world scenarios. Our proposed architecture is based on partial stacking, which feeds only the initial results with low confidence to a meta-classifier. Experimental results show that our method achieves a much faster inference speed than BERT-based models with comparable performance.
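The partial stacking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base models, the confidence measure, or the meta-classifier, so the toy scorers, the `margin` threshold, and every function name below are hypothetical stand-ins chosen only to show the control flow (confident initial results are returned directly; low-confidence ones are passed to a meta-classifier).

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a scorer maps a comment to P(toxic) in [0, 1].
Scorer = Callable[[str], float]

TOXIC_WORDS = {"idiot", "trash"}  # toy lexicon, for illustration only


def lexicon_scorer(text: str) -> float:
    """Toy base scorer: more lexicon hits means a higher toxicity score."""
    hits = sum(word in TOXIC_WORDS for word in text.lower().split())
    return min(1.0, 0.1 + 0.45 * hits)


def shouting_scorer(text: str) -> float:
    """Toy base scorer: fraction of alphabetic characters in upper case."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


def max_meta(scores: Sequence[float]) -> float:
    """Toy meta-classifier over the base scores (assumption: takes the max)."""
    return max(scores)


def partial_stacking_predict(
    text: str,
    base_scorers: List[Scorer],
    meta_scorer: Callable[[Sequence[float]], float],
    margin: float = 0.3,
) -> Tuple[bool, str]:
    """Return (is_toxic, stage), where stage records which stage decided."""
    scores = [scorer(text) for scorer in base_scorers]
    avg = sum(scores) / len(scores)
    # Confident initial result: the average is far from the 0.5 boundary.
    if abs(avg - 0.5) >= margin:
        return avg >= 0.5, "base"
    # Low confidence: only these cases pay for the meta-classifier.
    return meta_scorer(scores) >= 0.5, "meta"


if __name__ == "__main__":
    scorers = [lexicon_scorer, shouting_scorer]
    for comment in ["nice stream", "that play was trash"]:
        print(comment, partial_stacking_predict(comment, scorers, max_meta))
```

In this sketch the second stage runs only for comments whose averaged base score falls within `margin` of the decision boundary, which is the general reason a partial scheme can be cheaper than running every comment through the full stack.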
Anthology ID:
2022.emnlp-industry.58
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
December
Year:
2022
Address:
Abu Dhabi, UAE
Editors:
Yunyao Li, Angeliki Lazaridou
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
571–578
URL:
https://aclanthology.org/2022.emnlp-industry.58
DOI:
10.18653/v1/2022.emnlp-industry.58
Cite (ACL):
Yuto Oikawa, Yuki Nakayama, and Koji Murakami. 2022. A Stacking-based Efficient Method for Toxic Language Detection on Live Streaming Chat. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 571–578, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
A Stacking-based Efficient Method for Toxic Language Detection on Live Streaming Chat (Oikawa et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-industry.58.pdf