PsychEthicsBench: Evaluating Large Language Models Against Australian Mental Health Ethics

Yaling Shen; Stephanie Fong; Yiwen Jiang; Zimu Wang; Feilong Tang; Qingyang Xu; Xiangyu Zhao; Zhongxing Xu; Jiahe Liu; Jinpeng Hu; Dominic Dwyer; Zongyuan Ge

Abstract

The increasing integration of large language models (LLMs) into mental health applications necessitates robust frameworks for evaluating professional safety alignment. Current evaluation approaches rely primarily on refusal-based safety signals, which offer limited insight into the nuanced behaviors required in clinical practice. In mental health, clinically inadequate refusals can be perceived as unempathetic and can discourage help-seeking. To address this gap, we move beyond refusal-centric metrics and introduce PsychEthicsBench, the first principle-grounded benchmark based on Australian psychology and psychiatry guidelines, designed to evaluate LLMs’ ethical knowledge and behavioral responses through multiple-choice and open-ended tasks with fine-grained ethicality annotations. Empirical results across 14 models show that refusal rates are poor indicators of ethical behavior, with a significant divergence between safety triggers and clinical appropriateness. Notably, we find that domain-specific fine-tuning can degrade ethical robustness, as several specialized models underperform their base backbones in ethical alignment. PsychEthicsBench provides a foundation for systematic, jurisdiction-aware evaluation of LLMs in mental health, encouraging more responsible development in this domain.

Anthology ID: 2026.findings-acl.1971
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 39571–39589
URL: https://aclanthology.org/2026.findings-acl.1971/
Cite (ACL): Yaling Shen, Stephanie Fong, Yiwen Jiang, Zimu Wang, Feilong Tang, Qingyang Xu, Xiangyu Zhao, Zhongxing Xu, Jiahe Liu, Jinpeng Hu, Dominic Dwyer, and Zongyuan Ge. 2026. PsychEthicsBench: Evaluating Large Language Models Against Australian Mental Health Ethics. In Findings of the Association for Computational Linguistics: ACL 2026, pages 39571–39589, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): PsychEthicsBench: Evaluating Large Language Models Against Australian Mental Health Ethics (Shen et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1971.pdf
Checklist: 2026.findings-acl.1971.checklist.pdf
