Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla

Omar Riyad, Trina Chakraborty, Abhishek Dey


Abstract
This paper describes our participation in Task1 (VITD) of BLP Workshop 1 at EMNLP 2023,focused on the detection and categorizationof threats linked to violence, which could po-tentially encourage more violent actions. Ourapproach involves fine-tuning of pre-trainedtransformer models and employing techniqueslike self-training with external data, data aug-mentation through back-translation, and en-semble learning (bagging and majority voting).Notably, self-training improves performancewhen applied to data from external source butnot when applied to the test-set. Our anal-ysis highlights the effectiveness of ensemblemethods and data augmentation techniques inBangla Text Classification. Our system ini-tially scored 0.70450 and ranked 19th amongthe participants but post-competition experi-ments boosted our score to 0.72740.
Anthology ID:
2023.banglalp-1.32
Volume:
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
Month:
December
Year:
2023
Address:
Singapore
Editors:
Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
Venue:
BanglaLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
247–254
Language:
URL:
https://aclanthology.org/2023.banglalp-1.32
DOI:
10.18653/v1/2023.banglalp-1.32
Bibkey:
Cite (ACL):
Omar Riyad, Trina Chakraborty, and Abhishek Dey. 2023. Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 247–254, Singapore. Association for Computational Linguistics.
Cite (Informal):
Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla (Riyad et al., BanglaLP 2023)
Copy Citation:
PDF:
https://aclanthology.org/2023.banglalp-1.32.pdf