Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection

Debjoy Saha, Naman Paharia, Debajit Chakraborty, Punyajoy Saha, Animesh Mukherjee


Abstract
Social media often acts as breeding grounds for different forms of offensive content. For low resource languages like Tamil, the situation is more complex due to the poor performance of multilingual or language-specific models and lack of proper benchmark datasets. Based on this shared task “Offensive Language Identification in Dravidian Languages” at EACL 2021; we present an exhaustive exploration of different transformer models, We also provide a genetic algorithm technique for ensembling different models. Our ensembled models trained separately for each language secured the first position in Tamil, the second position in Kannada, and the first position in Malayalam sub-tasks. The models and codes are provided.
Anthology ID:
2021.dravidianlangtech-1.38
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Parameswari Krishnamurthy, Elizabeth Sherly
Venue:
DravidianLangTech
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
270–276
Language:
URL:
https://aclanthology.org/2021.dravidianlangtech-1.38
DOI:
Bibkey:
Cite (ACL):
Debjoy Saha, Naman Paharia, Debajit Chakraborty, Punyajoy Saha, and Animesh Mukherjee. 2021. Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 270–276, Kyiv. Association for Computational Linguistics.
Cite (Informal):
Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection (Saha et al., DravidianLangTech 2021)
Copy Citation:
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.38.pdf
Code
 Debjoy10/Hate-Alert-DravidianLangTech