SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification

Sai Muralidhar Jayanthi, Akshat Gupta


Abstract
In this paper, we present our submission to the EACL 2021 Shared Task on Offensive Language Identification in Dravidian languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models that leverage task-adaptive pre-training of multilingual BERT models with a masked language modeling objective. Our system ranked 1st for Kannada, 2nd for Malayalam, and 3rd for Tamil.
Anthology ID:
2021.dravidianlangtech-1.44
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Venues:
DravidianLangTech | EACL
Publisher:
Association for Computational Linguistics
Pages:
307–312
URL:
https://aclanthology.org/2021.dravidianlangtech-1.44
Cite (ACL):
Sai Muralidhar Jayanthi and Akshat Gupta. 2021. SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 307–312, Kyiv. Association for Computational Linguistics.
Cite (Informal):
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification (Jayanthi & Gupta, DravidianLangTech 2021)
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.44.pdf
Software:
2021.dravidianlangtech-1.44.Software.zip
Code:
murali1996/eacl2021-OffensEval-Dravidian