MUCS@DravidianLangTech-EACL2021:COOLI-Code-Mixing Offensive Language Identification

Fazlourrahman Balouchzahi, Aparna B K, H L Shashirekha


Abstract
This paper describes the models submitted by team MUCS to the Offensive Language Identification in Dravidian Languages shared task at EACL 2021. The task aims to identify and classify code-mixed texts in three language pairs, namely Kannada-English (Kn-En), Malayalam-English (Ma-En), and Tamil-English (Ta-En), into six predefined categories (five categories for the Ma-En language pair). Two models, COOLI-Ensemble and COOLI-Keras, are trained on character sequences extracted from the sentences, combined with words, as features. Of the two proposed models, COOLI-Ensemble performed best: it ranked first for the Ma-En language pair with a weighted F1-score of 0.97, and fourth and sixth for the Ta-En and Kn-En language pairs with weighted F1-scores of 0.75 and 0.69, respectively.
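The feature idea named in the abstract, character sequences combined with words, can be illustrated with a minimal sketch. This is not the authors' implementation: the n-gram lengths (2 to 3), the lowercasing, the whitespace tokenization, the `w:`/`c:` prefixes, the function names, and the sample sentence are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return every character n-gram of length n_min..n_max found in text.

    The range 2..3 is an assumed default, not a value taken from the paper.
    """
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def extract_features(sentence, n_min=2, n_max=3):
    """Count word tokens and character n-grams in a single feature bag.

    Words get a "w:" prefix and character n-grams a "c:" prefix so the two
    feature types cannot collide (a hypothetical naming scheme).
    """
    normalized = sentence.lower()
    features = Counter()
    for word in normalized.split():
        features["w:" + word] += 1
    for gram in char_ngrams(normalized, n_min, n_max):
        features["c:" + gram] += 1
    return features


# A made-up romanized code-mixed comment, used only as sample input.
example = extract_features("Super padam")
```

A bag like `example` could then be vectorized and passed to any classifier. Character n-grams are a common choice for code-mixed text because romanized spellings vary widely, and sub-word fragments still overlap when whole words do not.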
Anthology ID:
2021.dravidianlangtech-1.47
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Venues:
DravidianLangTech | EACL
Publisher:
Association for Computational Linguistics
Pages:
323–329
URL:
https://aclanthology.org/2021.dravidianlangtech-1.47
Cite (ACL):
Fazlourrahman Balouchzahi, Aparna B K, and H L Shashirekha. 2021. MUCS@DravidianLangTech-EACL2021:COOLI-Code-Mixing Offensive Language Identification. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 323–329, Kyiv. Association for Computational Linguistics.
Cite (Informal):
MUCS@DravidianLangTech-EACL2021:COOLI-Code-Mixing Offensive Language Identification (Balouchzahi et al., DravidianLangTech 2021)
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.47.pdf
Software:
 2021.dravidianlangtech-1.47.Software.zip