CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

Jiahao Zhao; Jingwei Zhu; Minghuan Tan; Min Yang; Renhao Li; Yang Di; Chenhao Zhang; Guancheng Ye; Chengming Li; Xiping Hu; Derek F. Wong (黄辉)

Abstract

In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese examination systems. CPsyExam is designed to prioritize psychological knowledge and case analysis separately, recognizing the significance of applying psychological knowledge to real-world scenarios. We collect 22k questions from 39 psychology-related subjects across four Chinese examination systems. From this pool of 22k questions, we use 4k to create a benchmark that offers balanced coverage of subjects and incorporates a diverse range of case analysis techniques. Furthermore, we evaluate a range of existing large language models (LLMs), from open-source to proprietary models. Our experiments and analysis demonstrate that CPsyExam serves as an effective benchmark for enhancing the understanding of psychology within LLMs and enables the comparison of LLMs across various granularities.

Anthology ID: 2025.coling-main.745
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 11248–11260
URL: https://aclanthology.org/2025.coling-main.745/
Cite (ACL): Jiahao Zhao, Jingwei Zhu, Minghuan Tan, Min Yang, Renhao Li, Yang Di, Chenhao Zhang, Guancheng Ye, Chengming Li, Xiping Hu, and Derek F. Wong. 2025. CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations. In Proceedings of the 31st International Conference on Computational Linguistics, pages 11248–11260, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations (Zhao et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.745.pdf
