Jonathan Zomick


2021

pdf bib
Automatic Detection and Prediction of Psychiatric Hospitalizations From Social Media Posts
Zhengping Jiang | Jonathan Zomick | Sarah Ita Levitan | Mark Serper | Julia Hirschberg
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

We address the problem of predicting psychiatric hospitalizations using linguistic features drawn from social media posts. We formulate this novel task and develop an approach to automatically extract time spans of self-reported psychiatric hospitalizations. Using this dataset, we build predictive models of psychiatric hospitalization, comparing feature sets, user vs. post classification, and comparing model performance using a varying time window of posts. Our best model achieves an F1 of .718 using 7 days of posts. Our results suggest that this is a useful framework for collecting hospitalization data, and that social media data can be leveraged to predict acute psychiatric crises before they occur, potentially saving lives and improving outcomes for individuals with mental illness.

2020

pdf bib
Detection of Mental Health from Reddit via Deep Contextualized Representations
Zhengping Jiang | Sarah Ita Levitan | Jonathan Zomick | Julia Hirschberg
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis

We address the problem of automatic detection of psychiatric disorders from the linguistic content of social media posts. We build a large scale dataset of Reddit posts from users with eight disorders and a control user group. We extract and analyze linguistic characteristics of posts and identify differences between diagnostic groups. We build strong classification models based on deep contextualized word representations and show that they outperform previously applied statistical models with simple linguistic features by large margins. We compare user-level and post-level classification performance, as well as an ensembled multiclass model.

2019

pdf bib
Linguistic Analysis of Schizophrenia in Reddit Posts
Jonathan Zomick | Sarah Ita Levitan | Mark Serper
Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology

We explore linguistic indicators of schizophrenia in Reddit discussion forums. Schizophrenia (SZ) is a chronic mental disorder that affects a person’s thoughts and behaviors. Identifying and detecting signs of SZ is difficult given that SZ is relatively uncommon, affecting approximately 1% of the US population, and people suffering with SZ often believe that they do not have the disorder. Linguistic abnormalities are a hallmark of SZ and many of the illness’s symptoms are manifested through language. In this paper we leverage the vast amount of data available from social media and use statistical and machine learning approaches to study linguistic characteristics of SZ. We collected and analyzed a large corpus of Reddit posts from users claiming to have received a formal diagnosis of SZ and identified several linguistic features that differentiated these users from a control (CTL) group. We compared these results to other findings on social media linguistic analysis and SZ. We also developed a machine learning classifier to automatically identify self-identified users with SZ on Reddit.