Knowdee at BLP-2023 Task 2: Improving Bangla Sentiment Analysis Using Ensembled Models with Pseudo-Labeling

Xiaoyi Liu; Mao Teng; Shuangtao Yang; Bo Fu

doi:10.18653/v1/2023.banglalp-1.35

Abstract

This paper outlines our submission to the Sentiment Analysis Shared Task at the Bangla Language Processing (BLP) Workshop at EMNLP 2023 (Hasan et al., 2023a). The objective of this task is to classify the sentiment of each text as Positive, Negative, or Neutral. The shared task is based on the MUltiplatform BAngla SEntiment (MUBASE) (Hasan et al., 2023b) and SentNoB (Islam et al., 2021) datasets, which consist of public comments from various social media platforms. Our proposed method is based on the pre-trained Bangla language model BanglaBERT (Bhattacharjee et al., 2022). We trained an ensemble of BanglaBERT models on the original dataset and used it to generate pseudo-labels for data augmentation. This expanded dataset was then used to train our final models. During the evaluation phase, 30 teams submitted their systems, and our system achieved the second-highest performance, with an F1 score of 0.7267. The source code of the proposed approach is available at https://github.com/KnowdeeAI/blp_task2_knowdee.git.
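
The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: it assumes the ensemble members' class probabilities for the unlabeled texts are already available as arrays, combines them by soft voting (averaging), and keeps only predictions above a confidence threshold. The function name `pseudo_label`, the label order in `LABELS`, and the threshold of 0.9 are hypothetical choices made for illustration; the abstract does not say how the ensemble outputs were combined or filtered.

```python
import numpy as np

# Hypothetical label order; the shared task uses these three classes.
LABELS = ("Negative", "Neutral", "Positive")


def pseudo_label(member_probs, threshold=0.9):
    """Select confident pseudo-labels from an ensemble's predictions.

    member_probs: array-like of shape (n_models, n_examples, n_classes)
        holding each model's class probabilities for the unlabeled texts.
    threshold: assumed confidence cut-off (not a value from the paper).

    Returns the indices of the kept examples and their label ids.
    """
    probs = np.asarray(member_probs, dtype=float)
    mean_probs = probs.mean(axis=0)       # soft voting across models
    predicted = mean_probs.argmax(axis=1)  # most likely class per example
    confidence = mean_probs.max(axis=1)    # averaged probability of that class
    keep = np.flatnonzero(confidence >= threshold)
    return keep, predicted[keep]


# Toy probabilities from two models for three unlabeled texts.
member_probs = [
    [[0.05, 0.05, 0.90], [0.40, 0.35, 0.25], [0.92, 0.05, 0.03]],
    [[0.02, 0.03, 0.95], [0.30, 0.45, 0.25], [0.96, 0.02, 0.02]],
]
keep, labels = pseudo_label(member_probs, threshold=0.9)
for index, label_id in zip(keep, labels):
    print(index, LABELS[label_id])
```

In this toy input the second text is dropped because the models disagree and the averaged confidence is low. The kept examples, paired with their predicted labels, would then be appended to the original training set before the final models are trained.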

Anthology ID: 2023.banglalp-1.35
Volume: Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
Month: December
Year: 2023
Address: Singapore
Editors: Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
Venue: BanglaLP
Publisher: Association for Computational Linguistics
Pages: 273–278
URL: https://aclanthology.org/2023.banglalp-1.35/
DOI: 10.18653/v1/2023.banglalp-1.35
Cite (ACL): Xiaoyi Liu, Mao Teng, Shuangtao Yang, and Bo Fu. 2023. Knowdee at BLP-2023 Task 2: Improving Bangla Sentiment Analysis Using Ensembled Models with Pseudo-Labeling. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 273–278, Singapore. Association for Computational Linguistics.
Cite (Informal): Knowdee at BLP-2023 Task 2: Improving Bangla Sentiment Analysis Using Ensembled Models with Pseudo-Labeling (Liu et al., BanglaLP 2023)
PDF: https://aclanthology.org/2023.banglalp-1.35.pdf
