Maoqin @ DravidianLangTech-EACL2021: The Application of Transformer-Based Model

Maoqin Yang


Abstract
This paper describes the results of team Maoqin at DravidianLangTech-EACL2021. The shared task covers three languages (Tamil, Malayalam, and Kannada); I participated only in the Malayalam subtask. The goal of the task is to identify offensive language in a code-mixed dataset of comments and posts in Dravidian languages (Tamil-English, Malayalam-English, and Kannada-English) collected from social media. This is a classification task at the comment/post level: given a YouTube comment, a system must classify it as Not-offensive, Offensive-untargeted, Offensive-targeted-individual, Offensive-targeted-group, Offensive-targeted-other, or Not-in-intended-language. I use a transformer-based language model with BiGRU-Attention for this task. To validate the model, I also compare it against several other neural network models. The team ranked 5th in this task with a weighted average F1 score of 0.93 on the private leaderboard.
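The weighted average F1 mentioned in the abstract can be illustrated with a short sketch. This is not the organizers' scoring script; the function name `weighted_f1` and the toy labels below are assumptions made for illustration. It computes a per-class F1 and averages the classes weighted by their support in the gold labels, which is the usual definition of a weighted average F1 (the same definition as scikit-learn's `average="weighted"`).

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)      # gold count per class
    predicted = Counter(y_pred)    # predicted count per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        # Precision is 0 when the class was never predicted.
        precision = true_pos[label] / predicted[label] if predicted[label] else 0.0
        recall = true_pos[label] / n
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of gold labels.
        score += f1 * n / total
    return score


# Toy example (made-up labels, not data from the shared task).
gold = ["Not-offensive", "Not-offensive", "Not-offensive", "Offensive-untargeted"]
pred = ["Not-offensive", "Not-offensive", "Offensive-untargeted", "Offensive-untargeted"]
print(round(weighted_f1(gold, pred), 4))
```

Because frequent classes dominate the weighted average, a high score on an imbalanced dataset can coexist with weak performance on the rare offensive classes.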
Anthology ID:
2021.dravidianlangtech-1.40
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Editors:
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Parameswari Krishnamurthy, Elizabeth Sherly
Venue:
DravidianLangTech
Publisher:
Association for Computational Linguistics
Pages:
281–286
URL:
https://aclanthology.org/2021.dravidianlangtech-1.40
Cite (ACL):
Maoqin Yang. 2021. Maoqin @ DravidianLangTech-EACL2021: The Application of Transformer-Based Model. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 281–286, Kyiv. Association for Computational Linguistics.
Cite (Informal):
Maoqin @ DravidianLangTech-EACL2021: The Application of Transformer-Based Model (Yang, DravidianLangTech 2021)
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.40.pdf
Software:
 2021.dravidianlangtech-1.40.Software.zip
Dataset:
 2021.dravidianlangtech-1.40.Dataset.zip