Social Media Hate and Offensive Speech Detection Using Machine Learning method

Girma Bade, Olga Kolesnikova, Grigori Sidorov, José Oropeza


Abstract
Although the misuse of social media is increasing, technology also offers solutions. Here, misuse means posting hate and offensive speech that might harm an individual or group. Hate speech refers to insults directed at an individual or group based on their identity, and its spread on social media platforms is a serious problem for society. Natural language processing (NLP) technology is capable of detecting and handling such content. This paper presents the detection of hate and offensive speech on social media in code-mixed Telugu. The task and gold-standard dataset were provided by the organizers of the shared task at DravidianLangTech@EACL 2024. We employed the TF-IDF technique for numeric feature extraction and a random forest algorithm to model hate speech detection. The resulting model was evaluated on the test dataset and achieved a macro-F1 of 0.492.
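The pipeline named in the abstract (TF-IDF features, a random forest classifier, macro-F1 evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation: the use of scikit-learn, the default vectorizer settings, the hyperparameters, the label names, and the toy placeholder texts are all assumptions made for illustration, and the shared-task dataset is not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Toy placeholder data (hypothetical; NOT the shared-task dataset).
# The label names "hate" and "non-hate" are assumed for illustration.
train_texts = [
    "insult_a you are insult_b",
    "insult_b go away insult_a",
    "nice movie loved the songs",
    "great match well played team",
    "insult_a insult_a everyone here",
    "thanks for sharing this video",
]
train_labels = ["hate", "hate", "non-hate", "non-hate", "hate", "non-hate"]

test_texts = ["insult_b you insult_a", "loved this video thanks"]
test_labels = ["hate", "non-hate"]

# TF-IDF turns each text into a numeric vector; the random forest
# classifies those vectors. Settings below are assumed, not the paper's.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(train_texts, train_labels)

predictions = model.predict(test_texts)

# Macro-F1 averages the per-class F1 scores with equal weight per class,
# so a minority class counts as much as the majority class.
macro_f1 = f1_score(test_labels, predictions, average="macro")
print(f"macro-F1 on toy data: {macro_f1:.3f}")
```

The score printed for this toy data says nothing about the 0.492 reported above, which was measured on the shared-task test set.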
Anthology ID:
2024.dravidianlangtech-1.40
Volume:
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month:
March
Year:
2024
Address:
St. Julian's, Malta
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Rajeswari Nadarajan, Manikandan Ravikiran
Venues:
DravidianLangTech | WS
Publisher:
Association for Computational Linguistics
Pages:
240–244
URL:
https://aclanthology.org/2024.dravidianlangtech-1.40
Cite (ACL):
Girma Bade, Olga Kolesnikova, Grigori Sidorov, and José Oropeza. 2024. Social Media Hate and Offensive Speech Detection Using Machine Learning method. In Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 240–244, St. Julian's, Malta. Association for Computational Linguistics.
Cite (Informal):
Social Media Hate and Offensive Speech Detection Using Machine Learning method (Bade et al., DravidianLangTech-WS 2024)
PDF:
https://aclanthology.org/2024.dravidianlangtech-1.40.pdf