PCIC at SMM4H 2024: Enhancing Reddit Post Classification on Social Anxiety Using Transformer Models and Advanced Loss Functions

Leon Hecht, Victor Pozos, Helena Gomez Adorno, Gibran Fuentes-Pineda, Gerardo Sierra, Gemma Bel-Enguix


Abstract
We present our approach to solving the task of identifying the effect of outdoor activities on social anxiety based on reddit posts. We employed state-of-the-art transformer models enhanced with a combination of advanced loss functions. Data augmentation techniques were also used to address class imbalance within the training set. Our method achieved a macro-averaged F1-score of 0.655 on the test data, surpassing the workshop’s mean F1-Score of 0.519. These findings suggest that integrating weighted loss functions improves the performance of transformer models in classifying unbalanced text data, while data augmentation can improve the model’s ability to generalize.
Anthology ID:
2024.smm4h-1.14
Volume:
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dongfang Xu, Graciela Gonzalez-Hernandez
Venues:
SMM4H | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
63–66
Language:
URL:
https://aclanthology.org/2024.smm4h-1.14
DOI:
Bibkey:
Cite (ACL):
Leon Hecht, Victor Pozos, Helena Gomez Adorno, Gibran Fuentes-Pineda, Gerardo Sierra, and Gemma Bel-Enguix. 2024. PCIC at SMM4H 2024: Enhancing Reddit Post Classification on Social Anxiety Using Transformer Models and Advanced Loss Functions. In Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, pages 63–66, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
PCIC at SMM4H 2024: Enhancing Reddit Post Classification on Social Anxiety Using Transformer Models and Advanced Loss Functions (Hecht et al., SMM4H-WS 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.smm4h-1.14.pdf