Chandreen Liyanage


2023

Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health
Chandreen Liyanage | Muskan Garg | Vijay Mago | Sunghwan Sohn
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Amid an ongoing health crisis, there is a growing need to discern possible signs of Wellness Dimensions (WD) manifested in self-narrated text. As the distribution of WD in social media data is intrinsically imbalanced, we experiment with generative AI techniques for data augmentation to further improve the pre-screening task of classifying WD. To this end, we propose a simple yet effective data augmentation approach based on prompt-based generative AI models, and evaluate the ROUGE scores and the syntactic/semantic similarity between existing interpretations and augmented data. Our approach with the ChatGPT model surpasses all the other methods and improves over baselines such as Easy-Data Augmentation (EDA) and Backtranslation (BT).
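As a rough illustration of the kind of overlap measure the abstract refers to, the following minimal sketch computes a ROUGE-1 F1 score between an original text and an augmented version. It is not the authors' evaluation code: the function name `rouge1_f1`, the whitespace tokenization, and the example sentences are all assumptions made for illustration.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap (ROUGE-1) F1 between a reference and a candidate text.

    Assumption: lowercased whitespace tokenization, with no stemming or
    stopword handling. Published ROUGE toolkits may tokenize differently.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example texts, not taken from the paper's dataset.
original = "i feel anxious about losing my job"
augmented = "i feel worried about losing my job"
score = rouge1_f1(original, augmented)
```

A high score here means the augmented text stays lexically close to the original; a lower score indicates more lexical variation, which on its own says nothing about whether the meaning was preserved.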