Evaluating Large Language Models for Enhancing Live Chat Therapy: A Comparative Study with Psychotherapists

Neha Pravin Deshpande; Stefan Hillmann; Sebastian Möller

Abstract

Large Language Models (LLMs) hold promise for addressing the shortage of qualified therapists in mental health care. While chatbot-based Cognitive Behavioral Therapy (CBT) tools exist, their efficacy in sensitive contexts remains underexplored. This study examines the potential of LLMs to support therapy sessions aimed at reducing Child Sexual Abuse Material (CSAM) consumption. We propose a Retrieval-Augmented Generation (RAG) framework that leverages a fine-tuned BERT-based retriever to guide LLM-generated responses, better capturing the multi-turn, context-specific dynamics of therapy. Four LLMs—Qwen2-7B-Instruct, Mistral-7B-Instruct-v0.3, Orca-2-13B, and Zephyr-7B-Alpha—were evaluated in a small-scale study with 14 domain-expert psychotherapists. Our comparative analysis reveals that, in certain scenarios, LLMs like Mistral-7B-Instruct-v0.3 and Orca-2-13B were preferred over human therapist responses. While limited by sample size, these findings suggest that LLMs can perform at a level comparable to or even exceeding that of human therapists, especially in therapy focused on reducing CSAM consumption. Our code is available online: https://git.tu-berlin.de/neha.deshpande/therapy_responses/-/tree/main

Anthology ID: 2025.sigdial-1.59
Volume: Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month: August
Year: 2025
Address: Avignon, France
Editors: Frédéric Béchet, Fabrice Lefèvre, Nicholas Asher, Seokhwan Kim, Teva Merlin
Venue: SIGDIAL
SIG: SIGDIAL
Publisher: Association for Computational Linguistics
Pages: 800–812
URL: https://aclanthology.org/2025.sigdial-1.59/
Cite (ACL): Neha Pravin Deshpande, Stefan Hillmann, and Sebastian Möller. 2025. Evaluating Large Language Models for Enhancing Live Chat Therapy: A Comparative Study with Psychotherapists. In Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 800–812, Avignon, France. Association for Computational Linguistics.
Cite (Informal): Evaluating Large Language Models for Enhancing Live Chat Therapy: A Comparative Study with Psychotherapists (Deshpande et al., SIGDIAL 2025)
PDF: https://aclanthology.org/2025.sigdial-1.59.pdf
