To Block or not to Block: Experiments with Machine Learning for News Comment Moderation

Damir Korencic, Ipek Baris, Eugenia Fernandez, Katarina Leuschel, Eva Sánchez Salido


Abstract
Today, news media organizations regularly engage with readers by enabling them to comment on news articles. This creates the need for comment moderation and removal of disallowed comments – a time-consuming task often performed by human moderators. In this paper, we approach the problem of automatic news comment moderation as classification of comments into blocked and not blocked categories. We construct a novel dataset of annotated English comments, experiment with cross-lingual transfer of comment labels, and evaluate several machine learning models on datasets of Croatian and Estonian news comments. Team name: SuperAdmin; Challenge: Detection of blocked comments; Tools/models: CroSloEn BERT, FinEst BERT, 24Sata comment dataset, Ekspress comment dataset.
Anthology ID:
2021.hackashop-1.18
Volume:
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
Month:
April
Year:
2021
Address:
Online
Editors:
Hannu Toivonen, Michele Boggia
Venue:
Hackashop
Publisher:
Association for Computational Linguistics
Pages:
127–133
URL:
https://aclanthology.org/2021.hackashop-1.18
Cite (ACL):
Damir Korencic, Ipek Baris, Eugenia Fernandez, Katarina Leuschel, and Eva Sánchez Salido. 2021. To Block or not to Block: Experiments with Machine Learning for News Comment Moderation. In Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation, pages 127–133, Online. Association for Computational Linguistics.
Cite (Informal):
To Block or not to Block: Experiments with Machine Learning for News Comment Moderation (Korencic et al., Hackashop 2021)
PDF:
https://aclanthology.org/2021.hackashop-1.18.pdf