A Semi-Supervised Approach to Detect Toxic Comments

Ghivvago Damas Saraiva; Rafael Anchiêta; Francisco Assis Ricarte Neto; Raimundo Moura

Abstract

Toxic comments contain forms of unacceptable language targeted at groups or individuals. Such comments have become a serious concern for government organizations, online communities, and social media platforms. Although some approaches exist for handling unacceptable language, most of them focus on supervised learning and the English language. In this paper, we treat toxic comment detection as a semi-supervised task over a heterogeneous graph. We evaluate the approach on a Portuguese-language dataset of toxic comments, outperforming several graph-based methods and achieving competitive results compared to transformer architectures.

Anthology ID: 2021.ranlp-1.142
Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month: September
Year: 2021
Address: Held Online
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 1261–1267
URL: https://aclanthology.org/2021.ranlp-1.142
Cite (ACL): Ghivvago Damas Saraiva, Rafael Anchiêta, Francisco Assis Ricarte Neto, and Raimundo Moura. 2021. A Semi-Supervised Approach to Detect Toxic Comments. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1261–1267, Held Online. INCOMA Ltd.
Cite (Informal): A Semi-Supervised Approach to Detect Toxic Comments (Saraiva et al., RANLP 2021)
PDF: https://aclanthology.org/2021.ranlp-1.142.pdf
Code: rafaelanchieta/toxic
Data: ToLD-Br
