LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models

Dasun Athukoralage, Thushari Atapattu, Menasha Thilakaratne, Katrina Falkner


Abstract
This paper presents our approaches for the SMM4H’24 Shared Task 5 on the binary classification of English tweets reporting children’s medical disorders. Our first approach involves fine-tuning a single RoBERTa-large model, while the second approach entails ensembling the results of three fine-tuned BERTweet-large models. We demonstrate that although both approaches exhibit identical performance on validation data, the BERTweet-large ensemble excels on test data. Our best-performing system achieves an F1-score of 0.938 on test data, outperforming the benchmark classifier by 1.18%.
Anthology ID:
2024.smm4h-1.9
Volume:
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dongfang Xu, Graciela Gonzalez-Hernandez
Venues:
SMM4H | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
38–41
Language:
URL:
https://aclanthology.org/2024.smm4h-1.9
DOI:
Bibkey:
Cite (ACL):
Dasun Athukoralage, Thushari Atapattu, Menasha Thilakaratne, and Katrina Falkner. 2024. LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models. In Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, pages 38–41, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
LT4SG@SMM4H’24: Tweets Classification for Digital Epidemiology of Childhood Health Outcomes Using Pre-Trained Language Models (Athukoralage et al., SMM4H-WS 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.smm4h-1.9.pdf