Prashant Manandhar
2024
Profanity and Offensiveness Detection in Nepali Language Using Bi-directional LSTM Models
Abiral Adhikari | Prashant Manandhar | Reewaj Khanal | Samir Wagle | Praveen Acharya | Bal Krishna Bal
Proceedings of the 21st International Conference on Natural Language Processing (ICON)
Offensive and profane content has been on the rise on Nepali social media, which is very disturbing to users. This is partly due to the absence of proper tools and mechanisms for dealing with profane and offensive text in the Nepali language. In this work, we attempt to develop a deep learning-based tool for detecting profane and offensive comments. We develop a Bi-LSTM (Bidirectional Long Short-Term Memory) based model for the classification of profane and offensive comments and study different variations of the task. Furthermore, multilingual BERT embeddings and vocab embeddings, among others, were used for an accurate understanding of the intent and decency of the posts. While previous related studies in the Nepali language focus mainly on sentiment and offensiveness detection, our study explores profanity and offensiveness detection as two distinct tasks. Our Bi-LSTM model achieves 87.8% accuracy for