Problematic Webpage Identification: A Trilogy of Hatespeech, Search Engines and GPT

Ojasvin Sood, Sandipan Dandapat


Abstract
In this paper, we introduce a fine-tuned transformer-based model for problematic webpage classification, which identifies webpages promoting hate and violence of various forms. Because labelled problematic webpage data is unavailable, we first propose a novel webpage data collection strategy that leverages well-studied short-text hate speech datasets. We then introduce a custom GPT-4 few-shot prompt annotation scheme that takes various webpage features as input to carry out the otherwise prohibitively expensive webpage annotation task. The resulting annotated data is used to build our problematic webpage classification model. We report the performance of our webpage classification model (87.6% F1-score) and conduct a detailed comparison against other state-of-the-art hate speech classification models on the problematic webpage identification task. Finally, we showcase the importance of various webpage features in identifying a problematic webpage.
Anthology ID:
2023.woah-1.13
Volume:
The 7th Workshop on Online Abuse and Harms (WOAH)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Yi-ling Chung, Paul Röttger, Debora Nozza, Zeerak Talat, Aida Mostafazadeh Davani
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
126–137
URL:
https://aclanthology.org/2023.woah-1.13
DOI:
10.18653/v1/2023.woah-1.13
Cite (ACL):
Ojasvin Sood and Sandipan Dandapat. 2023. Problematic Webpage Identification: A Trilogy of Hatespeech, Search Engines and GPT. In The 7th Workshop on Online Abuse and Harms (WOAH), pages 126–137, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Problematic Webpage Identification: A Trilogy of Hatespeech, Search Engines and GPT (Sood & Dandapat, WOAH 2023)
PDF:
https://aclanthology.org/2023.woah-1.13.pdf