Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health

Chandreen Liyanage, Muskan Garg, Vijay Mago, Sunghwan Sohn


Abstract
Amid the ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative AI techniques for data augmentation to enable further improvement in the pre-screening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach through prompt-based generative AI models, and evaluate the ROUGE scores and syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and achieves improvement over baselines such as Easy Data Augmentation (EDA) and Backtranslation (BT).
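As a rough illustration of the ROUGE-based comparison mentioned in the abstract, the sketch below computes a unigram-overlap ROUGE-1 score between an original text and an augmented version. It is a minimal sketch only: the abstract does not state which ROUGE variant, tokenizer, or library the authors used, so the function name `rouge_1`, the whitespace tokenization, and the two example sentences are all assumptions made for illustration.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """Unigram-overlap ROUGE-1 (precision, recall, F1).

    Assumption: lowercased whitespace tokens; the paper's actual
    preprocessing and ROUGE variant are not given in the abstract.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical texts, not taken from the paper's dataset.
    original = "i feel alone and tired"
    augmented = "i feel lonely and tired"
    print(rouge_1(original, augmented))
```

A high score here means the augmented text reuses most of the original wording; a useful augmentation would typically aim to preserve meaning while still varying the surface form, which is why the abstract pairs ROUGE with syntactic and semantic similarity.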
Anthology ID:
2023.bionlp-1.27
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
306–312
URL:
https://aclanthology.org/2023.bionlp-1.27
DOI:
10.18653/v1/2023.bionlp-1.27
Cite (ACL):
Chandreen Liyanage, Muskan Garg, Vijay Mago, and Sunghwan Sohn. 2023. Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 306–312, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health (Liyanage et al., BioNLP 2023)
PDF:
https://aclanthology.org/2023.bionlp-1.27.pdf