Towards Automatic Generation of Messages Countering Online Hate Speech and Microaggressions

Mana Ashida, Mamoru Komachi


Abstract
With the widespread use of social media, online hate is increasing, and microaggressions are receiving attention. We explore the potential for using pretrained language models to automatically generate messages that combat the associated offensive texts. Specifically, we focus on using prompting to steer model generation, as it requires less data and computation than fine-tuning. We also propose three perspectives for human evaluation: offensiveness, stance, and informativeness. After obtaining 306 counterspeech and 42 microintervention messages generated by GPT-2, GPT-3, and GPT-Neo, we conducted a human evaluation using Amazon Mechanical Turk. The results indicate the potential of using prompting in the proposed generation task. All the generated texts, along with the annotations, are published to encourage future research on countering hate and microaggressions online.
Anthology ID:
2022.woah-1.2
Volume:
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)
Month:
July
Year:
2022
Address:
Seattle, Washington (Hybrid)
Editors:
Kanika Narang, Aida Mostafazadeh Davani, Lambert Mathias, Bertie Vidgen, Zeerak Talat
Venue:
WOAH
Publisher:
Association for Computational Linguistics
Pages:
11–23
URL:
https://aclanthology.org/2022.woah-1.2
DOI:
10.18653/v1/2022.woah-1.2
Cite (ACL):
Mana Ashida and Mamoru Komachi. 2022. Towards Automatic Generation of Messages Countering Online Hate Speech and Microaggressions. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 11–23, Seattle, Washington (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Towards Automatic Generation of Messages Countering Online Hate Speech and Microaggressions (Ashida & Komachi, WOAH 2022)
PDF:
https://aclanthology.org/2022.woah-1.2.pdf
Video:
https://aclanthology.org/2022.woah-1.2.mp4
Code:
tmu-nlp/chasm
Data:
CONAN, SBIC