Degree based Classification of Harmful Speech using Twitter Data

Sanjana Sharma, Saksham Agrawal, Manish Shrivastava


Abstract
Harmful speech takes various forms and plagues social media in different ways. To crack down on different degrees of hate speech and abusive behavior within it, classification needs to rest on criteria more complex than whether the speech is racist, sexist, or directed against a particular group or community, and those criteria need to be defined and accounted for. This paper primarily describes how we created an ontological classification of harmful speech based on the degree of hateful intent and used it to annotate Twitter data. The key contribution of this paper is the new dataset of tweets we created, based on ontological classes and the degrees of harmful speech found in the text. We also propose a supervised classification system for recognizing these harmful speech classes in text. This is preliminary work that lays a foundation for defining different classes of harmful speech; subsequent work will aim to make its automatic detection more robust and efficient.
Anthology ID:
W18-4413
Volume:
Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018)
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Ritesh Kumar, Atul Kr. Ojha, Marcos Zampieri, Shervin Malmasi
Venue:
TRAC
Publisher:
Association for Computational Linguistics
Pages:
106–112
URL:
https://aclanthology.org/W18-4413
Cite (ACL):
Sanjana Sharma, Saksham Agrawal, and Manish Shrivastava. 2018. Degree based Classification of Harmful Speech using Twitter Data. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC-2018), pages 106–112, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Degree based Classification of Harmful Speech using Twitter Data (Sharma et al., TRAC 2018)
PDF:
https://aclanthology.org/W18-4413.pdf