Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language

Selam Abitte Kanta, Grigori Sidorov, Alexander Gelbukh


Abstract
Social media has become a powerful tool for sharing information while upholding the principle of free expression. However, this openness has given rise to significant problems such as hate speech, cyberbullying, aggression, and offensive language, which harm societal well-being. These problems can damage the mental health of victims and, in severe cases, lead to suicidal thoughts. Our primary goal is to develop an automated system for the rapid detection of offensive content on social media, enabling timely intervention and moderation. This research employs several machine learning classifiers that use character N-gram TF-IDF features. Additionally, we introduce SVM, RL, and Convolutional Neural Network (CNN) models specifically designed for hate speech detection. The SVM uses character N-gram TF-IDF features, while the CNN uses word embedding features. Across our experiments, the best result was a weighted F1-score of 0.77 in identifying hate speech and offensive language.
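The abstract names two technical ingredients, character N-gram TF-IDF features and the weighted F1-score, without spelling them out. The sketch below illustrates both in plain Python. It is not the authors' implementation: the function names, the N-gram range of 2 to 3, and the smoothed IDF variant are assumptions chosen for illustration, since the abstract does not state them.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams of length n_min..n_max (inclusive).

    The 2..3 default range is an assumption, not a value from the paper.
    """
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n_min=2, n_max=3):
    """Map each document to a sparse {ngram: tf * idf} dictionary."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for c in counts for g in c)
    n_docs = len(docs)
    # Smoothed IDF (one common variant, assumed here): ln((1+N)/(1+df)) + 1
    idf = {g: math.log((1 + n_docs) / (1 + f)) + 1 for g, f in df.items()}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)
```

In a full system, vectors like these would be fed to a classifier such as an SVM, and `weighted_f1` would be computed on held-out predictions. Weighting by class support matters for this kind of task because offensive and non-offensive classes are often imbalanced.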
Anthology ID:
2024.dravidianlangtech-1.14
Volume:
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
March
Year:
2024
Address:
St. Julian's, Malta
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Rajeswari Nadarajan, Manikandan Ravikiran
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
91–95
URL:
https://aclanthology.org/2024.dravidianlangtech-1.14
Cite (ACL):
Selam Abitte Kanta, Grigori Sidorov, and Alexander Gelbukh. 2024. Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language. In Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 91–95, St. Julian's, Malta. Association for Computational Linguistics.
Cite (Informal):
Selam@DravidianLangTech 2024:Identifying Hate Speech and Offensive Language (Abitte Kanta et al., DravidianLangTech-WS 2024)
PDF:
https://aclanthology.org/2024.dravidianlangtech-1.14.pdf