Chenhao Zhang
2025
CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations
Jiahao Zhao
|
Jingwei Zhu
|
Minghuan Tan
|
Min Yang
|
Renhao Li
|
Yang Di
|
Chenhao Zhang
|
Guancheng Ye
|
Chengming Li
|
Xiping Hu
|
Derek F. Wong
Proceedings of the 31st International Conference on Computational Linguistics
In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese examination systems. CPsyExam is designed to prioritize psychological knowledge and case analysis separately, recognizing the significance of applying psychological knowledge to real-world scenarios. We collect 22k questions from 39 psychology-related subjects across four Chinese examination systems. From the pool of 22k questions, we utilize 4k to create the benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), spanning from open-sourced to proprietary models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.
2024
CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling
Chenhao Zhang
|
Renhao Li
|
Minghuan Tan
|
Min Yang
|
Jingwei Zhu
|
Di Yang
|
Jiahao Zhao
|
Guancheng Ye
|
Chengming Li
|
Xiping Hu
Findings of the Association for Computational Linguistics: ACL 2024
Using large language models (LLMs) to assist psychological counseling is a significant but challenging task at present. Attempts have been made on improving empathetic conversations or acting as effective assistants in the treatment with LLMs. However, the existing datasets lack consulting knowledge, resulting in LLMs lacking professional consulting competence. Moreover, how to automatically evaluate multi-turn dialogues within the counseling process remains an understudied area. To bridge the gap, we propose CPsyCoun, a report-based multi-turn dialogue reconstruction and evaluation framework for Chinese psychological counseling. To fully exploit psychological counseling reports, a two-phase approach is devised to construct high-quality dialogues while a comprehensive evaluation benchmark is developed for the effective automatic evaluation of multi-turn psychological consultations. Competitive experimental results demonstrate the effectiveness of our proposed framework in psychological counseling. We open-source the datasets and model for future research.