JustATalentedTeam@DravidianLangTech 2025: A Study of ML and DL approaches for Sentiment Analysis in Code-Mixed Tamil and Tulu Texts

Ponsubash Raj R; Paruvatha Priya B; Bharathi B

doi:10.18653/v1/2025.dravidianlangtech-1.47

Abstract

The growing prevalence of code-mixed text on social media presents unique challenges for sentiment analysis, particularly in low-resource languages like Tamil and Tulu. This paper explores sentiment classification in Tamil-English and Tulu-English code-mixed datasets using both machine learning (ML) and deep learning (DL) approaches. The ML model utilizes TF-IDF feature extraction combined with a Logistic Regression classifier, while the DL model employs FastText embeddings and a BiLSTM network enhanced with an attention mechanism. Experimental results reveal that the ML model outperforms the DL model in terms of macro F1-score for both languages. Specifically, for Tamil, the ML model achieves a macro F1-score of 0.46, surpassing the DL model’s score of 0.43. For Tulu, the ML model significantly outperforms the DL model, achieving 0.60 compared to 0.48. This performance disparity is more pronounced in Tulu due to its smaller dataset size of 13,308 samples compared to Tamil’s 31,122 samples, highlighting the data efficiency of ML models in low-resource settings. The study provides insights into the strengths and limitations of each approach, demonstrating that traditional ML techniques remain competitive for code-mixed sentiment analysis when data is limited. These findings contribute to ongoing research in multilingual NLP and offer practical implications for applications such as social media monitoring, customer feedback analysis, and conversational AI in Dravidian languages.
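Since the abstract compares both models by macro F1-score, a small worked example may help show what that metric rewards. The sketch below is not the authors' evaluation code: the function name `macro_f1` and the toy labels are assumptions made purely for illustration. It computes F1 per class and averages the classes without weighting, so a rare class counts as much as a frequent one.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("macro F1 is undefined for empty input")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels (not drawn from the Tamil or Tulu datasets).
y_true = ["Positive", "Positive", "Positive", "Negative", "Neutral", "Positive"]
y_pred = ["Positive", "Positive", "Positive", "Positive", "Neutral", "Negative"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case four of six predictions are correct, yet the macro F1 is lower than the accuracy because the single "Negative" example is missed and that class contributes an F1 of zero to the average. This is the sense in which macro F1 is sensitive to minority classes in imbalanced sentiment data.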

Anthology ID: 2025.dravidianlangtech-1.47
Volume: Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Month: May
Year: 2025
Address: Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico
Editors: Bharathi Raja Chakravarthi, Ruba Priyadharshini, Anand Kumar Madasamy, Sajeetha Thavareesan, Elizabeth Sherly, Saranya Rajiakodi, Balasubramanian Palani, Malliga Subramanian, Subalalitha Cn, Dhivya Chinnappa
Venues: DravidianLangTech | WS
Publisher: Association for Computational Linguistics
Pages: 273–277
URL: https://aclanthology.org/2025.dravidianlangtech-1.47/
DOI: 10.18653/v1/2025.dravidianlangtech-1.47
Cite (ACL): Ponsubash Raj R, Paruvatha Priya B, and Bharathi B. 2025. JustATalentedTeam@DravidianLangTech 2025: A Study of ML and DL approaches for Sentiment Analysis in Code-Mixed Tamil and Tulu Texts. In Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages, pages 273–277, Acoma, The Albuquerque Convention Center, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): JustATalentedTeam@DravidianLangTech 2025: A Study of ML and DL approaches for Sentiment Analysis in Code-Mixed Tamil and Tulu Texts (R et al., DravidianLangTech 2025)
PDF: https://aclanthology.org/2025.dravidianlangtech-1.47.pdf
