Sudev Basti


2024

A Machine Learning Framework for Detecting Hate Speech and Fake Narratives in Hindi-English Tweets
R.n. Yadawad | Sunil Saumya | K.n. Nivedh | Siddhaling S. Padanur | Sudev Basti
Proceedings of the 21st International Conference on Natural Language Processing (ICON): Shared Task on Decoding Fake Narratives in Spreading Hateful Stories (Faux-Hate)

This paper presents a novel system developed for the Faux-Hate Shared Task at ICON 2024, addressing the detection of hate speech and fake narratives within Hindi-English code-mixed social media data. Our approach combines advanced text preprocessing, TF-IDF vectorization, and Random Forest classifiers to identify harmful content, while employing SMOTE to address class imbalance. By leveraging ensemble learning and feature engineering, our system demonstrates robust performance in detecting hateful and fake content, classifying targets, and evaluating the severity of hate speech. The results underscore the potential for real-world applications, such as moderating online platforms and identifying harmful narratives. Furthermore, we highlight ethical considerations for deploying such tools, emphasizing responsible use in sensitive domains, thereby advancing research in multilingual hate speech detection and online abuse mitigation.
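
The pipeline named in the abstract (TF-IDF vectorization, SMOTE oversampling, Random Forest classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy sentences and labels are invented, the task is reduced to a single binary label, the hyperparameters are arbitrary, and `smote_oversample` is a hypothetical hand-written helper that follows the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors


def smote_oversample(X, y, minority_label, k=2, seed=0):
    """Hypothetical helper: SMOTE-style oversampling for a binary problem.

    Synthesizes minority rows by interpolating between a minority sample
    and one of its k nearest minority neighbours until both classes have
    the same number of rows.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is returned as its own
    # nearest neighbour in column 0.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    synthetic = X_min[base] + gap * (X_min[neigh] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label)])
    return X_out, y_out


# Invented toy data (assumption): 0 = benign, 1 = hateful.
texts = [
    "aaj ka match bahut accha tha",
    "kal office jaana hai yaar",
    "weather aaj kaafi nice hai",
    "movie dekhi bahut mazedaar thi",
    "chai peene chalo friends",
    "new phone liya camera great hai",
    "in logon ko yahan se nikalo",
    "ye log hamesha problem create karte hain",
    "in logon se nafrat hai mujhe",
]
labels = np.array([0] * 6 + [1] * 3)

# Word unigrams and bigrams; the n-gram range is an arbitrary choice.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(texts).toarray()

# Balance the classes on the training features only.
X_bal, y_bal = smote_oversample(X, labels, minority_label=1)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_bal, y_bal)

pred = clf.predict(vectorizer.transform(["in logon ko nikalo"]).toarray())
```

In practice the oversampling step should be applied only to the training split, after vectorization, so that synthetic rows never leak into evaluation data. The abstract also mentions target classification and severity scoring; those would need separate labels and are not covered by this sketch.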