@inproceedings{balaga-etal-2025-power,
title = "Power doesn{'}t reside in size: A Low Parameter Hybrid Language Model ({HLM}) for Sentiment Analysis in Code-mixed data",
author = "Balaga, Pavan Sai and
Karthik, Nagasamudram and
Vishwanath, Challa and
Sharma, Raksha and
Murthy, Rudra and
Mittal, Ashish",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.749/",
pages = "14819--14827",
ISBN = "979-8-89176-332-6",
    abstract = "Code-mixed text{---}where multiple languages are used within the same utterance{---}is increasingly common in both spoken and written communication. However, it presents significant challenges for machine learning models due to the interplay of distinct grammatical structures, effectively forming a hybrid language. While fine-tuning large language models (LLMs) such as GPT-3 or Llama-3 on code-mixed data has led to performance improvements, these models still lag behind their monolingual counterparts and incur high computational costs due to the large number of trainable parameters. In this paper, we focus on the task of sentiment detection in code-mixed text and propose a Hybrid Language Model (HLM) that combines a multilingual encoder (e.g., mBERT) with a lightweight decoder (e.g., Sarvam-1, 3B parameters). Despite having significantly fewer trainable parameters, HLM achieves sentiment classification performance comparable to that of fine-tuned LLMs ({\ensuremath{>}} 7B parameters). Furthermore, our results demonstrate that HLM significantly outperforms models trained individually, underscoring its effectiveness for low-resource, code-mixed sentiment analysis."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="balaga-etal-2025-power">
<titleInfo>
<title>Power doesn’t reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data</title>
</titleInfo>
<name type="personal">
<namePart type="given">Pavan</namePart>
<namePart type="given">Sai</namePart>
<namePart type="family">Balaga</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nagasamudram</namePart>
<namePart type="family">Karthik</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Challa</namePart>
<namePart type="family">Vishwanath</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Raksha</namePart>
<namePart type="family">Sharma</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rudra</namePart>
<namePart type="family">Murthy</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ashish</namePart>
<namePart type="family">Mittal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-11</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing</title>
</titleInfo>
<name type="personal">
<namePart type="given">Christos</namePart>
<namePart type="family">Christodoulopoulos</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tanmoy</namePart>
<namePart type="family">Chakraborty</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Carolyn</namePart>
<namePart type="family">Rose</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Violet</namePart>
<namePart type="family">Peng</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Suzhou, China</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-332-6</identifier>
</relatedItem>
<abstract>Code-mixed text—where multiple languages are used within the same utterance—is increasingly common in both spoken and written communication. However, it presents significant challenges for machine learning models due to the interplay of distinct grammatical structures, effectively forming a hybrid language. While fine-tuning large language models (LLMs) such as GPT-3 or Llama-3 on code-mixed data has led to performance improvements, these models still lag behind their monolingual counterparts and incur high computational costs due to the large number of trainable parameters. In this paper, we focus on the task of sentiment detection in code-mixed text and propose a Hybrid Language Model (HLM) that combines a multilingual encoder (e.g., mBERT) with a lightweight decoder (e.g., Sarvam-1, 3B parameters). Despite having significantly fewer trainable parameters, HLM achieves sentiment classification performance comparable to that of fine-tuned LLMs (&gt; 7B parameters). Furthermore, our results demonstrate that HLM significantly outperforms models trained individually, underscoring its effectiveness for low-resource, code-mixed sentiment analysis.</abstract>
<identifier type="citekey">balaga-etal-2025-power</identifier>
<location>
<url>https://aclanthology.org/2025.emnlp-main.749/</url>
</location>
<part>
<date>2025-11</date>
<extent unit="page">
<start>14819</start>
<end>14827</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Power doesn’t reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data
%A Balaga, Pavan Sai
%A Karthik, Nagasamudram
%A Vishwanath, Challa
%A Sharma, Raksha
%A Murthy, Rudra
%A Mittal, Ashish
%Y Christodoulopoulos, Christos
%Y Chakraborty, Tanmoy
%Y Rose, Carolyn
%Y Peng, Violet
%S Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
%D 2025
%8 November
%I Association for Computational Linguistics
%C Suzhou, China
%@ 979-8-89176-332-6
%F balaga-etal-2025-power
%X Code-mixed text—where multiple languages are used within the same utterance—is increasingly common in both spoken and written communication. However, it presents significant challenges for machine learning models due to the interplay of distinct grammatical structures, effectively forming a hybrid language. While fine-tuning large language models (LLMs) such as GPT-3 or Llama-3 on code-mixed data has led to performance improvements, these models still lag behind their monolingual counterparts and incur high computational costs due to the large number of trainable parameters. In this paper, we focus on the task of sentiment detection in code-mixed text and propose a Hybrid Language Model (HLM) that combines a multilingual encoder (e.g., mBERT) with a lightweight decoder (e.g., Sarvam-1, 3B parameters). Despite having significantly fewer trainable parameters, HLM achieves sentiment classification performance comparable to that of fine-tuned LLMs (> 7B parameters). Furthermore, our results demonstrate that HLM significantly outperforms models trained individually, underscoring its effectiveness for low-resource, code-mixed sentiment analysis.
%U https://aclanthology.org/2025.emnlp-main.749/
%P 14819-14827
Markdown (Informal)
[Power doesn’t reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data](https://aclanthology.org/2025.emnlp-main.749/) (Balaga et al., EMNLP 2025)
ACL
Pavan Sai Balaga, Nagasamudram Karthik, Challa Vishwanath, Raksha Sharma, Rudra Murthy, and Ashish Mittal. 2025. Power doesn't reside in size: A Low Parameter Hybrid Language Model (HLM) for Sentiment Analysis in Code-mixed data. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 14819–14827, Suzhou, China. Association for Computational Linguistics.
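
The abstract describes HLM as a multilingual encoder (e.g., mBERT) coupled with a lightweight decoder (e.g., Sarvam-1) for code-mixed sentiment classification, but this citation page gives no implementation details. The snippet below is only a minimal PyTorch sketch of one way such a hybrid could be wired up; the Hugging Face model identifiers, the projection-based fusion of encoder states into the decoder, and the mean-pooled classification head are assumptions for illustration, not the authors' method.

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class HybridSentimentClassifier(nn.Module):
    """Illustrative encoder-decoder hybrid for code-mixed sentiment (not the paper's code)."""

    def __init__(self,
                 encoder_name="bert-base-multilingual-cased",  # mBERT
                 decoder_name="sarvamai/sarvam-1",             # assumed HF id for Sarvam-1
                 num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # Load the decoder backbone without its language-modeling head.
        self.decoder = AutoModel.from_pretrained(decoder_name)
        # Project encoder hidden states into the decoder's embedding space.
        self.proj = nn.Linear(self.encoder.config.hidden_size,
                              self.decoder.config.hidden_size)
        self.classifier = nn.Linear(self.decoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        enc_out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Feed projected encoder states to the decoder as input embeddings
        # (one possible fusion scheme; the paper may fuse the two models differently).
        dec_out = self.decoder(inputs_embeds=self.proj(enc_out.last_hidden_state),
                               attention_mask=attention_mask)
        # Mask-aware mean pooling over decoder states, then a sentiment head.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (dec_out.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        return self.classifier(pooled)


# Usage sketch with the mBERT tokenizer; a real pipeline would also need to
# handle the decoder's tokenizer/vocabulary consistently.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
batch = tokenizer(["movie mast tha, totally worth it!"], return_tensors="pt", padding=True)
model = HybridSentimentClassifier()
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # (1, num_labels)
```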