Sentiment Analysis of Dravidian Code Mixed Data

Asrita Venkata Mandalam; Yashvardhan Sharma

Abstract

This paper presents the methodologies implemented to classify Dravidian code-mixed comments according to their polarity. Using the available code-mixed Tamil and Malayalam datasets, three methods are proposed: a sub-word level model, a word embedding based model and a machine learning based architecture. The sub-word and word embedding based models used a Long Short-Term Memory (LSTM) network along with language-specific preprocessing, while the machine learning model used term frequency–inverse document frequency (TF-IDF) vectorization along with a Logistic Regression model. The sub-word level model was submitted to the track ‘Sentiment Analysis for Dravidian Languages in Code-Mixed Text’ proposed by the Forum for Information Retrieval Evaluation in 2020 (FIRE 2020). Although it ranked 5th and 12th for the Tamil and Malayalam tasks respectively in the FIRE 2020 track, this paper improves upon those results to attain final weighted F1-scores of 0.65 for the Tamil task and 0.68 for the Malayalam task. The former score equals that attained by the highest ranked team of the Tamil track.
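The weighted F1-score reported above averages the per-class F1 scores, weighting each class by its number of true instances. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `weighted_f1` and the toy labels are assumptions made up for illustration and are not taken from the Tamil or Malayalam datasets.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: predicted `label` and actually `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        score += (n_true / total) * f1
    return score


# Toy labels invented for illustration only.
y_true = ["positive", "positive", "positive", "negative", "negative", "mixed"]
y_pred = ["positive", "positive", "negative", "negative", "positive", "mixed"]
print(round(weighted_f1(y_true, y_pred), 3))  # 0.667
```

Because the weights follow class support, frequent classes dominate the score, which matters for imbalanced sentiment data.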

Anthology ID: 2021.dravidianlangtech-1.6
Volume: Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Month: April
Year: 2021
Address: Kyiv
Editors: Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar M, Parameswari Krishnamurthy, Elizabeth Sherly
Venue: DravidianLangTech
Publisher: Association for Computational Linguistics
Pages: 46–54
URL: https://aclanthology.org/2021.dravidianlangtech-1.6/
Cite (ACL): Asrita Venkata Mandalam and Yashvardhan Sharma. 2021. Sentiment Analysis of Dravidian Code Mixed Data. In Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages, pages 46–54, Kyiv. Association for Computational Linguistics.
Cite (Informal): Sentiment Analysis of Dravidian Code Mixed Data (Mandalam & Sharma, DravidianLangTech 2021)
PDF: https://aclanthology.org/2021.dravidianlangtech-1.6.pdf
Software: 2021.dravidianlangtech-1.6.Software.zip