ssn_diBERTsity@LT-EDI-EACL2021:Hope Speech Detection on multilingual YouTube comments via transformer based approach

Arunima S, Akshay Ramakrishnan, Avantika Balaji, Thenmozhi D., Senthil Kumar B


Abstract
There is abundant recent research on classifying abusive and offensive texts, which focuses on negative comments, but only minimal research using the positive reinforcement approach. The task aimed to classify texts into ‘Hope_speech’, ‘Non_hope_speech’, and ‘Not in language’. The datasets were provided by the LT-EDI organisers in English, Tamil, and Malayalam, with texts sourced from YouTube comments. We trained transformer models on these datasets, specifically mBERT for Tamil and Malayalam and BERT for English, and achieved weighted average F1-scores of 0.46, 0.81, and 0.92 for Tamil, Malayalam, and English, respectively.
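The weighted average F1-score reported in the abstract is commonly computed as the per-class F1 averaged with weights proportional to each class's support (its number of true instances). The sketch below illustrates that standard definition in plain Python. It is not the task's official evaluation script; the function names `per_class_f1` and `weighted_f1` are hypothetical, and the toy labels are invented for illustration rather than taken from the dataset.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Define F1 as 0.0 when the class is neither present nor predicted.
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 over the classes in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        per_class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


# Invented toy data using the three class names from the abstract.
y_true = ["Hope_speech", "Hope_speech", "Non_hope_speech",
          "Non_hope_speech", "Non_hope_speech", "Not in language"]
y_pred = ["Hope_speech", "Non_hope_speech", "Non_hope_speech",
          "Non_hope_speech", "Hope_speech", "Not in language"]

score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.4f}")
```

Because the weights follow class support, a frequent class such as ‘Non_hope_speech’ influences the score more than a rare one, which is worth keeping in mind when comparing the scores across the three languages.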
Anthology ID:
2021.ltedi-1.12
Volume:
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
April
Year:
2021
Address:
Kyiv
Editors:
Bharathi Raja Chakravarthi, John P. McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
Venue:
LTEDI
Publisher:
Association for Computational Linguistics
Pages:
92–97
URL:
https://aclanthology.org/2021.ltedi-1.12
Cite (ACL):
Arunima S, Akshay Ramakrishnan, Avantika Balaji, Thenmozhi D., and Senthil Kumar B. 2021. ssn_diBERTsity@LT-EDI-EACL2021:Hope Speech Detection on multilingual YouTube comments via transformer based approach. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion, pages 92–97, Kyiv. Association for Computational Linguistics.
Cite (Informal):
ssn_diBERTsity@LT-EDI-EACL2021:Hope Speech Detection on multilingual YouTube comments via transformer based approach (S et al., LTEDI 2021)
PDF:
https://aclanthology.org/2021.ltedi-1.12.pdf
Software:
 2021.ltedi-1.12.Software.zip