Can Attention-based Transformers Explain or Interpret Cyberbullying Detection?

Kanishk Verma, Tijana Milosevic, Brian Davis


Abstract
Automated textual cyberbullying detection is known to be a challenging task. Messages associated with bullying are often expected to be a) abusive, b) targeted at a specific individual or group, or c) negative in sentiment. Transfer learning by fine-tuning pre-trained attention-based transformer language models (LMs) has achieved near state-of-the-art (SOA) precision in identifying textual fragments as bullying-related or not. This study looks closely at two SOA LMs, BERT and HateBERT, fine-tuned on real-life cyberbullying datasets from multiple social networking platforms. We seek to determine whether these fine-tuned pre-trained LMs learn textual cyberbullying attributes or merely syntactic features of the text. The results of our comprehensive experiments show that although attention weights are drawn more strongly to syntactic features of the text at every layer, attention weights cannot fully account for the decision-making of such attention-based transformers.
Anthology ID:
2022.trac-1.3
Volume:
Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022)
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Ritesh Kumar, Atul Kr. Ojha, Marcos Zampieri, Shervin Malmasi, Daniel Kadar
Venue:
TRAC
Publisher:
Association for Computational Linguistics
Pages:
16–29
URL:
https://aclanthology.org/2022.trac-1.3
Cite (ACL):
Kanishk Verma, Tijana Milosevic, and Brian Davis. 2022. Can Attention-based Transformers Explain or Interpret Cyberbullying Detection?. In Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022), pages 16–29, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
Can Attention-based Transformers Explain or Interpret Cyberbullying Detection? (Verma et al., TRAC 2022)
PDF:
https://aclanthology.org/2022.trac-1.3.pdf