2025
Detecting Changes in Mental Health Status via Reddit Posts in Response to Global Negative Events
Zenan Chen | Judita Preiss | Peter A. Bath
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Detecting population-level mental health responses to global negative events through social media language remains understudied, despite its potential for public health surveillance. While pretrained language models (PLMs) have shown promise in mental health detection, their effectiveness in capturing event-driven collective psychological shifts – especially across diverse crisis contexts – is unclear. We present a prototype evaluation of three PLMs for identifying population mental health dynamics triggered by real-world negative events. We introduce two novel datasets specifically designed for this task. Our findings suggest that DistilBERT is better suited to the noisier global negative events data, while MentalRoBERTa demonstrates the validity of the method on the tidier Covid-19 data. SHAP interpretability analysis of 500 randomly sampled posts revealed that mental-health-related vocabulary (anxiety, depression, worthless) emerged as the most influential linguistic markers for mental health classification.
A Multi-Baseline Framework for Ranking Global Event Significance Using Google Trends and Large Language Models
Zenan Chen
Proceedings of the 9th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing
There are no standardized metrics for quantifying the worldwide impact of events, which makes global event significance difficult to determine. While Google Trends has demonstrated utility in domain-specific studies, its application to global event ranking remains limited. This paper presents a framework combining Google Trends data with large language models for automated global event ranking. The study uses Command R+ and Llama 3.3-70B-Instruct to generate contextually relevant event keywords and establishes significance through comparative search volume analysis against baseline reference terms, incorporating temporal weighting mechanisms to address chronological biases. The proposed methodology identified globally significant events across technology, health, sports, and natural disasters from a dataset of 1,094 events (2020–2024) extracted from Wikipedia.
2024
Incorporating Word Count Information into Depression Risk Summary Generation: INF@UoS CLPsych 2024 Submission
Judita Preiss | Zenan Chen
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)
Large language model classifiers do not directly offer transparency: it is not clear why one class is chosen over another. In this work, Mistral-7B is used to generate summaries explaining the suicide risk level assigned by a fine-tuned mental-roberta-base model, based on key phrases extracted with SHAP explainability. The training data for the classifier consists of all Reddit posts of a user in the University of Maryland Reddit Suicidality Dataset, Version 2, with their suicide risk labels, along with selected features extracted from each post by the Linguistic Inquiry and Word Count (LIWC-22) tool. The resulting model is used to predict risk for each post of the users in the evaluation set of the CLPsych 2024 shared task, with a SHAP explainer used to identify the phrases contributing to the top-scoring, correct and severe risk categories. Some basic stoplisting is applied to the extracted phrases, along with length-based filtering, and a locally run version of Mistral-7B-Instruct-v0.1 is used to create summaries from the highest-value (based on SHAP) phrases.