Supernova@DravidianLangTech 2023@Abusive Comment Detection in Tamil and Telugu - (Tamil, Tamil-English, Telugu-English)

Ankitha Reddy, Pranav Moorthi, Ann Maria Thomas


Abstract
This paper focuses on using Support Vector Machine (SVM) classifiers with TF-IDF feature extraction to classify whether a comment is abusive or not. The paper tries to identify abusive content in regional languages. The dataset analysis presents the distribution of target variables in the Tamil-English, Telugu-English, and Tamil datasets. The methodology section describes the preprocessing steps, including ensuring consistency, removal of special characters and emojis, removal of stop words, and stemming of data. Overall, the study contributes to the field of abusive comment detection in Tamil and Telugu languages.
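The preprocessing steps named in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the suffix list, and the function names (`strip_symbols`, `stem`, `preprocess`) are hypothetical, and "consistency" is assumed here to mean lowercasing. The resulting tokens would then be passed to TF-IDF feature extraction and an SVM classifier.

```python
import unicodedata

# Hypothetical stop-word list, for illustration only (the abstract does not give one).
STOP_WORDS = {"the", "is", "a", "an", "and", "to", "of", "this", "so"}

# Hypothetical suffixes for a naive stemmer; a real system would use a proper stemmer.
SUFFIXES = ("ing", "ed", "s")
MIN_STEM_LEN = 3


def strip_symbols(text):
    """Replace special characters and emojis with spaces.

    Letters (L*), combining marks (M*), and numbers (N*) are kept, so Tamil
    and Telugu vowel signs survive; punctuation and symbols are dropped.
    """
    kept = []
    for ch in text:
        if unicodedata.category(ch)[0] in "LMN" or ch.isspace():
            kept.append(ch)
        else:
            kept.append(" ")
    return "".join(kept)


def stem(token):
    """Strip the first matching suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= MIN_STEM_LEN:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, remove symbols and emojis, drop stop words, then stem."""
    text = strip_symbols(text.lower())
    tokens = []
    for token in text.split():
        # Skip stray tokens that contain no letter or digit at all.
        if not any(unicodedata.category(ch)[0] in "LN" for ch in token):
            continue
        if token in STOP_WORDS:
            continue
        tokens.append(stem(token))
    return tokens


print(preprocess("This is SO annoying!!! 😡 #trolls"))  # ['annoy', 'troll']
```

Filtering by Unicode category rather than an ASCII-only pattern is a deliberate choice in this sketch, since the same function then handles both code-mixed (Latin-script) and native-script comments.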
Anthology ID:
2023.dravidianlangtech-1.32
Volume:
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Bharathi R. Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Sajeetha Thavareesan, Elizabeth Sherly
Venues:
DravidianLangTech | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
225–230
URL:
https://aclanthology.org/2023.dravidianlangtech-1.32
Cite (ACL):
Ankitha Reddy, Pranav Moorthi, and Ann Maria Thomas. 2023. Supernova@DravidianLangTech 2023@Abusive Comment Detection in Tamil and Telugu - (Tamil, Tamil-English, Telugu-English). In Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages, pages 225–230, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Supernova@DravidianLangTech 2023@Abusive Comment Detection in Tamil and Telugu - (Tamil, Tamil-English, Telugu-English) (Reddy et al., DravidianLangTech-WS 2023)
PDF:
https://aclanthology.org/2023.dravidianlangtech-1.32.pdf