Interpreting Neural Network Hate Speech Classifiers

Author: Cindy Wang
Date: October 2018
Published in: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
Editors: Darja Fišer, Ruihong Huang, Vinodkumar Prabhakaran, Rob Voigt, Zeerak Waseem, Jacqueline Wernimont
Publisher: Association for Computational Linguistics
Location: Brussels, Belgium
Type: conference publication
Pages: 86-92
Citation key: wang-2018-interpreting
DOI: 10.18653/v1/W18-5111
URL: https://aclanthology.org/W18-5111/