Task-Specific Pre-Training and Cross Lingual Transfer for Sentiment Analysis in Dravidian Code-Switched Languages

Akshat Gupta, Sai Krishna Rallabandi, Alan W Black


Abstract
Sentiment analysis in Code-Mixed languages has garnered a lot of attention in recent years. It is an important task for social media monitoring and has many applications, as a large chunk of social media data is Code-Mixed. In this paper, we work on the problem of sentiment analysis for Dravidian Code-Switched languages - Tamil-English and Malayalam-English - using three different BERT-based models. We leverage task-specific pre-training and cross-lingual transfer to improve on previously reported results, with significant improvement on the Tamil-English dataset. We also present a multilingual sentiment classification model that has competitive performance on both the Tamil-English and Malayalam-English datasets.
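The abstract mentions task-specific pre-training, i.e. continuing BERT's masked-language-model objective on in-domain code-mixed text before fine-tuning on sentiment labels. As a minimal sketch of only the data-preparation side of that idea (the function name, masking rate, and example sentence are illustrative assumptions, not details from the paper), one could generate masked-LM training pairs from raw code-mixed sentences like this:

```python
import random

def make_mlm_examples(sentences, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Create masked-LM (input, labels) pairs from raw code-mixed text.

    Task-specific pre-training continues the masked-LM objective on
    in-domain data (here, code-mixed social media text) before the model
    is fine-tuned for sentiment classification. This helper illustrates
    only the data preparation; the BERT model and training loop are
    assumed to come from a standard library.
    """
    rng = random.Random(seed)
    examples = []
    for sent in sentences:
        tokens = sent.split()
        masked, labels = [], []
        for tok in tokens:
            if rng.random() < mask_prob:
                masked.append(mask_token)
                labels.append(tok)        # the model must recover this token
            else:
                masked.append(tok)
                labels.append(None)       # no prediction at this position
        examples.append((" ".join(masked), labels))
    return examples

# A hypothetical Tamil-English code-mixed sentence (romanized Tamil + English).
pairs = make_mlm_examples(["intha movie romba nalla irukku super"], mask_prob=0.3)
```

In practice the masking would be applied at the subword level by the model's own tokenizer; word-level masking is used here only to keep the sketch self-contained.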
Anthology ID:
2021.dravidianlangtech-1.9
Volume:
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month:
April
Year:
2021
Address:
Kyiv
Venues:
DravidianLangTech | EACL
Publisher:
Association for Computational Linguistics
Pages:
73–79
URL:
https://aclanthology.org/2021.dravidianlangtech-1.9
PDF:
https://aclanthology.org/2021.dravidianlangtech-1.9.pdf
Software:
 2021.dravidianlangtech-1.9.Software.py
Data:
SentiMixTweetEval