Incorporating Word Count Information into Depression Risk Summary Generation: INF@UoS CLPsych 2024 Submission

Judita Preiss, Zenan Chen


Abstract
Large language model classifiers do not directly offer transparency: it is not clear why one class is chosen over another. In this work, summaries explaining the suicide risk level assigned using a fine-tuned mental-roberta-base model are generated from key phrases extracted using SHAP explainability using Mistral-7B. The training data for the classifier consists of all Reddit posts of a user in the University of Maryland Reddit Suicidality Dataset, Version 2, with their suicide risk labels along with selected features extracted from each post by the Linguistic Inquiry and Word Count (LIWC-22) tool. The resulting model is used to make predictions regarding risk on each post of the users in the evaluation set of the CLPsych 2024 shared task, with a SHAP explainer used to identify the phrases contributing to the top scoring, correct and severe risk categories. Some basic stoplisting is applied to the extracted phrases, along with length based filtering, and a locally run version of Mistral-7B-Instruct-v0.1 is used to create summaries from the highest value (based on SHAP) phrases.
Anthology ID:
2024.clpsych-1.19
Volume:
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Andrew Yates, Bart Desmet, Emily Prud’hommeaux, Ayah Zirikly, Steven Bedrick, Sean MacAvaney, Kfir Bar, Molly Ireland, Yaakov Ophir
Venues:
CLPsych | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
211–217
Language:
URL:
https://aclanthology.org/2024.clpsych-1.19
DOI:
Bibkey:
Cite (ACL):
Judita Preiss and Zenan Chen. 2024. Incorporating Word Count Information into Depression Risk Summary Generation: INF@UoS CLPsych 2024 Submission. In Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024), pages 211–217, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Incorporating Word Count Information into Depression Risk Summary Generation: INF@UoS CLPsych 2024 Submission (Preiss & Chen, CLPsych-WS 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.clpsych-1.19.pdf