Towards A Friendly Online Community: An Unsupervised Style Transfer Framework for Profanity Redaction

Minh Tran, Yipeng Zhang, Mohammad Soleymani


Abstract
Offensive and abusive language is a pressing problem on social media platforms. In this work, we propose a method for transforming offensive comments, statements containing profanity or offensive language, into non-offensive ones. We design a Retrieve, Generate and Edit unsupervised style transfer pipeline to redact the offensive comments in a word-restricted manner while maintaining a high level of fluency and preserving the content of the original text. We extensively evaluate our method’s performance and compare it to previous style transfer models using both automatic metrics and human evaluations. Experimental results show that our method outperforms other models on human evaluations and is the only approach that consistently performs well on all automatic evaluation metrics.
Anthology ID: 2020.coling-main.190
Volume: Proceedings of the 28th International Conference on Computational Linguistics
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Editors: Donia Scott, Nuria Bel, Chengqing Zong
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 2107–2114
URL: https://aclanthology.org/2020.coling-main.190
DOI: 10.18653/v1/2020.coling-main.190
Cite (ACL): Minh Tran, Yipeng Zhang, and Mohammad Soleymani. 2020. Towards A Friendly Online Community: An Unsupervised Style Transfer Framework for Profanity Redaction. In Proceedings of the 28th International Conference on Computational Linguistics, pages 2107–2114, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal): Towards A Friendly Online Community: An Unsupervised Style Transfer Framework for Profanity Redaction (Tran et al., COLING 2020)
PDF: https://aclanthology.org/2020.coling-main.190.pdf