DILAB at #SMM4H 2024: RoBERTa Ensemble for Identifying Children’s Medical Disorders in English Tweets

Azmine Toushik Wasi, Sheikh Rahman


Abstract
This paper details our system developed for the 9th Social Media Mining for Health Research and Applications Workshop (SMM4H 2024), addressing Task 5 focused on binary classification of English tweets reporting children’s medical disorders. Our objective was to enhance the detection of tweets related to children’s medical issues. To do this, we use various pre-trained language models, like RoBERTa and BERT. We fine-tuned these models on the task-specific dataset, adjusting model layers and hyperparameters in an attempt to optimize performance. As we observe unstable fluctuations in performance metrics during training, we implement an ensemble approach that combines predictions from different learning epochs. Our model achieves promising results, with the best-performing configuration achieving F1 score of 93.8% on the validation set and 89.8% on the test set.
Anthology ID:
2024.smm4h-1.3
Volume:
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dongfang Xu, Graciela Gonzalez-Hernandez
Venues:
SMM4H | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
10–12
Language:
URL:
https://aclanthology.org/2024.smm4h-1.3
DOI:
Bibkey:
Cite (ACL):
Azmine Toushik Wasi and Sheikh Rahman. 2024. DILAB at #SMM4H 2024: RoBERTa Ensemble for Identifying Children’s Medical Disorders in English Tweets. In Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, pages 10–12, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
DILAB at #SMM4H 2024: RoBERTa Ensemble for Identifying Children’s Medical Disorders in English Tweets (Wasi & Rahman, SMM4H-WS 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.smm4h-1.3.pdf