Abhishek Dey
2023
Team_Syrax at BLP-2023 Task 1: Data Augmentation and Ensemble Based Approach for Violence Inciting Text Detection in Bangla
Omar Faruqe Riyad | Trina Chakraborty | Abhishek Dey
Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
This paper describes our participation in Task 1 (VITD) of BLP Workshop 1 at EMNLP 2023, focused on the detection and categorization of threats linked to violence, which could potentially encourage more violent actions. Our approach involves fine-tuning pre-trained transformer models and employing techniques like self-training with external data, data augmentation through back-translation, and ensemble learning (bagging and majority voting). Notably, self-training improves performance when applied to data from an external source but not when applied to the test set. Our analysis highlights the effectiveness of ensemble methods and data augmentation techniques in Bangla text classification. Our system initially scored 0.70450 and ranked 19th among the participants, but post-competition experiments boosted our score to 0.72740.
2020
Lexical Tone Recognition in Mizo using Acoustic-Prosodic Features
Parismita Gogoi | Abhishek Dey | Wendy Lalhminghlui | Priyankoo Sarmah | S R Mahadeva Prasanna
Proceedings of the Twelfth Language Resources and Evaluation Conference
Mizo is an under-studied Tibeto-Burman tonal language of North-East India. Preliminary research findings have confirmed that four distinct tones (High, Low, Rising and Falling) appear in the language. In this work, an attempt is made to automatically recognize the four phonological tones of Mizo using acoustic-prosodic parameters as features. Six features computed from Fundamental Frequency (F0) contours are considered, and two classifier models, one based on a Support Vector Machine (SVM) and one on a Deep Neural Network (DNN), are implemented for the automatic tone recognition task. The Mizo database consists of 31950 iterations of the four Mizo tones, collected from 19 speakers using trisyllabic phrases. A four-way classification of tones is attempted with a balanced dataset (an equal number of iterations per tone category). It is observed that the DNN based classifier performs comparably to the SVM based classifier in correctly recognizing the four phonological Mizo tones.