Avantika Balaji
2021
ssn_diBERTsity@LT-EDI-EACL2021: Hope Speech Detection on multilingual YouTube comments via transformer based approach
Arunima S | Akshay Ramakrishnan | Avantika Balaji | Thenmozhi D. | Senthil Kumar B
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion
There is abundant research on classifying abusive and offensive texts, which focuses on negative comments, but only minimal research that takes the positive reinforcement approach. The task was to classify texts into ‘Hope_speech’, ‘Non_hope_speech’, and ‘Not in language’. The datasets, provided by the LT-EDI organisers in English, Tamil, and Malayalam, consist of texts sourced from YouTube comments. We trained transformer models on this data, specifically mBERT for Tamil and Malayalam and BERT for English, and achieved weighted average F1-scores of 0.46, 0.81, and 0.92 for Tamil, Malayalam, and English respectively.
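The weighted average F1-score reported above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `weighted_f1` and the toy labels are assumptions made for illustration, and the sketch uses the usual definition in which each class's F1 is weighted by its share of the true labels.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 (hypothetical helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0

    # Classes with zero support get zero weight, so iterating over
    # the classes present in y_true is sufficient.
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0

        # Weight each class's F1 by its share of the true labels.
        score += f1 * count / total

    return score


# Toy example with made-up labels (not data from the paper).
gold = ["Hope_speech", "Non_hope_speech", "Non_hope_speech", "Non_hope_speech"]
pred = ["Hope_speech", "Hope_speech", "Non_hope_speech", "Non_hope_speech"]
print(round(weighted_f1(gold, pred), 4))
```

Because each class is weighted by its support, a frequent class such as ‘Non_hope_speech’ dominates the score, which is worth keeping in mind when comparing results across languages with different class balances.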