Benchmark Creation for Aspect-Based Sentiment Analysis in Low-Resource Odia Language and Evaluation through Fine-Tuning of Multilingual Models

Lipika Dewangan, Zoyah Afsheen Sayeed, Chandresh Maurya


Abstract
The rapid growth of online product reviews has spurred significant interest in Aspect-Based Sentiment Analysis (ABSA), which involves identifying aspect terms and their associated sentiment polarity. While ABSA is widely studied in resource-rich languages such as English, Chinese, and Spanish, it remains underexplored in low-resource languages such as Odia. To address this gap, we create a reliable resource for aspect-based sentiment analysis in Odia. The dataset is annotated for two tasks, Aspect Term Extraction (ATE) and Aspect Polarity Classification (APC), spans seven domains, and is aligned with the SemEval-2014 benchmark. Furthermore, we enhance the dataset with an ensemble data augmentation approach that combines back-translation with a fine-tuned T5 paraphrase generation model. We then apply a semantic similarity filter based on the Universal Sentence Encoder (USE) to remove low-quality data and ensure a balanced distribution of sample difficulty in the augmented dataset. Finally, we validate the dataset by fine-tuning two multilingual pre-trained models, XLM-R and IndicBERT, on the ATE and APC tasks. Additionally, we use three classical baseline models to evaluate the quality of the proposed dataset on these tasks. We hope the Odia dataset will spur more work on ABSA.
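The semantic similarity filter described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `low` and `high` thresholds, and the use of an upper bound to drop near-verbatim copies are all assumptions made for illustration. The `embed` argument stands in for any sentence encoder (USE in the paper); toy two-dimensional vectors are used here so the example runs on its own.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_augmented(
    pairs: List[Tuple[str, str]],
    embed: Callable[[str], Vector],
    low: float = 0.7,    # hypothetical lower bound: below this, meaning has drifted
    high: float = 0.98,  # hypothetical upper bound: above this, a near-verbatim copy
) -> List[Tuple[str, str, float]]:
    """Keep (original, augmented) pairs whose similarity lies in [low, high]."""
    kept = []
    for original, augmented in pairs:
        score = cosine_similarity(embed(original), embed(augmented))
        if low <= score <= high:
            kept.append((original, augmented, score))
    return kept


# Toy stand-in for a sentence encoder: a lookup table of made-up vectors.
toy_vectors = {
    "orig": [1.0, 0.0],
    "good": [0.9, 0.3],  # close paraphrase
    "copy": [1.0, 0.0],  # identical to the original
    "bad": [0.0, 1.0],   # unrelated output
}
pairs = [("orig", "good"), ("orig", "copy"), ("orig", "bad")]
kept = filter_augmented(pairs, toy_vectors.__getitem__)
print(kept)
```

With these toy vectors only the "good" paraphrase survives: the exact copy exceeds the upper bound and the unrelated sentence falls below the lower bound. In practice the thresholds would need to be tuned on held-out data.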
Anthology ID:
2025.coling-main.391
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
5863–5869
URL:
https://aclanthology.org/2025.coling-main.391/
Cite (ACL):
Lipika Dewangan, Zoyah Afsheen Sayeed, and Chandresh Maurya. 2025. Benchmark Creation for Aspect-Based Sentiment Analysis in Low-Resource Odia Language and Evaluation through Fine-Tuning of Multilingual Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5863–5869, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Benchmark Creation for Aspect-Based Sentiment Analysis in Low-Resource Odia Language and Evaluation through Fine-Tuning of Multilingual Models (Dewangan et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.391.pdf