The Best of Both Worlds: Combining Engineered Features with Transformers for Improved Mental Health Prediction from Reddit Posts

Sourabh Zanwar, Daniel Wiechmann, Yu Qiao, Elma Kerz


Abstract
In recent years, there has been increasing interest in applying natural language processing and machine learning techniques to the detection of mental health conditions (MHC) from social media data. In this paper, we aim to improve on the state-of-the-art (SoTA) detection of six MHC in Reddit posts in two ways. First, we build models based on Bidirectional Long Short-Term Memory (BLSTM) networks trained on in-text distributions of a comprehensive set of psycholinguistic features, which makes MHC detection more explainable than black-box solutions. Second, we combine these BLSTM models with Transformers to improve prediction accuracy over SoTA models. In addition, we uncover nuanced patterns of linguistic markers characteristic of specific MHC.
Anthology ID:
2022.smm4h-1.50
Volume:
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
SMM4H
Publisher:
Association for Computational Linguistics
Pages:
197–202
URL:
https://aclanthology.org/2022.smm4h-1.50
Cite (ACL):
Sourabh Zanwar, Daniel Wiechmann, Yu Qiao, and Elma Kerz. 2022. The Best of Both Worlds: Combining Engineered Features with Transformers for Improved Mental Health Prediction from Reddit Posts. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 197–202, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
The Best of Both Worlds: Combining Engineered Features with Transformers for Improved Mental Health Prediction from Reddit Posts (Zanwar et al., SMM4H 2022)
PDF:
https://aclanthology.org/2022.smm4h-1.50.pdf
Data:
Dreaddit, SMHD