Sneha D’silva

2024

pdf bib abs
Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis
Amey Hengle | Atharva Kulkarni | Shantanu Deepak Patankar | Madhumitha Chandrasekaran | Sneha D’silva | Jemima S. Jacob | Rashmi Gupta
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

In this study, we introduce ANGST, a novel, first of its kind benchmark for depression-anxiety comorbidity classification from social media posts. Unlike contemporary datasets that often oversimplify the intricate interplay between different mental health disorders by treating them as isolated conditions, ANGST enables multi-label classification, allowing each post to be simultaneously identified as indicating depression and/or anxiety. Comprising 2876 meticulously annotated posts by expert psychologists and an additional 7667 silver-labeled posts, ANGST posits a more representative sample of online mental health discourse. Moreover, we benchmark ANGST using various state-of-the-art language models, ranging from Mental-BERT to GPT-4. Our results provide significant insights into the capabilities and limitations of these models in complex diagnostic scenarios. While GPT-4 generally outperforms other models, none achieve an F1 score exceeding 72% in multi-class comorbid classification, underscoring the ongoing challenges in applying language models to mental health diagnostics.

Co-authors

Shantanu Deepak Patankar 1

Venues

emnlp1

Fix author