Jaloliddin Rajabov


2024

pdf bib
UzABSA: Aspect-Based Sentiment Analysis for the Uzbek Language
Sanatbek Gayratovich Matlatipov | Jaloliddin Rajabov | Elmurod Kuriyozov | Mersaid Aripov
Proceedings of the 3rd Annual Meeting of the Special Interest Group on Under-resourced Languages @ LREC-COLING 2024

The objective of enhancing the availability of natural language processing technologies for low-resource languages has significant importance in facilitating technological accessibility within the populations of speakers of these languages. Our current grasping shows that there are no established linguistic resources available open source to develop aspect-based sentiment analysis (ABSA) tools tailored to the Uzbek language. This work aims to address the aforementioned gap by presenting the first high-quality annotated ABSA dataset - UzABSA. The data used in this study was obtained from a compilation of online reviews of Uzbek restaurants. Consequently, the constructed dataset has a length of 3500 reviews at the document level and 6100+ sentences at the sentence level. The popular approach to language resources of this kind explores four distinctive characteristics, namely Aspect Terms, Aspect Term Polarities, Aspect Category Terms, as well as Aspect Category Polarities. To the best of our knowledge, it is the first and the largest ABSA dataset for the Uzbek language. To evaluate the annotation process of our dataset, we used established statistical techniques such as Cohen’s kappa coefficient and Krippendorff’s 𝛼 to assess agreement between annotators. Subsequently, a classification model, namely K-Nearest Neighbour (KNN), was used to evaluate the performance of the created dataset. Both sets of evaluation techniques demonstrate comparable levels of accuracy. The first findings across the various tasks showed promising outcomes, with accuracy rates ranging from 72% to 88%. This study not only highlights the significance of our acquired dataset but also plays a valuable tool for scholars interested in furthering sentiment analysis in the Uzbek language.