Natalie Shapira


pdf bib
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira | Mosh Levy | Seyed Hossein Alavi | Xuhui Zhou | Yejin Choi | Yoav Goldberg | Maarten Sap | Vered Shwartz
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

The escalating debate on AI’s capabilities warrants developing reliable metrics to assess machine “intelligence.” Recently, many anecdotal examples were used to suggest that newer Large Language Models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM); however, prior work reached conflicting conclusions regarding those abilities. We investigate the extent of LLMs’ N-ToM through an extensive evaluation of 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust. We further examine the factors impacting performance on N-ToM tasks and discover that LLMs struggle with adversarial examples, indicating reliance on shallow heuristics rather than robust ToM abilities. We caution against drawing conclusions from anecdotal examples, limited benchmark testing, and using human-designed psychological tests to evaluate models.

pdf bib
Therapist Self-Disclosure as a Natural Language Processing Task
Natalie Shapira | Tal Alfi-Yogev
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

Therapist Self-Disclosure (TSD) within the context of psychotherapy entails the revelation of personal information by the therapist. The ongoing scholarly discourse surrounding the utility of TSD, spanning from the inception of psychotherapy to the present day, has underscored the need for greater specificity in conceptualizing TSD. This inquiry has yielded more refined classifications within the TSD domain, with a consensus emerging on the distinction between immediate and non-immediate TSD, each of which plays a distinct role in the therapeutic process. Despite this progress in the field of psychotherapy, the Natural Language Processing (NLP) domain currently lacks methodological solutions or explorations for such scenarios. This lacuna can be partly due to the difficulty of attaining publicly available clinical data. To address this gap, this paper presents an innovative NLP-based approach that formalizes TSD as an NLP task. The proposed methodology involves the creation of publicly available, expert-annotated test sets designed to simulate therapist utterances, and the employment of NLP techniques for evaluation purposes. By integrating insights from psychotherapy research with NLP methodologies, this study aims to catalyze advancements in both NLP and psychotherapy research.


pdf bib
How Well Do Large Language Models Perform on Faux Pas Tests?
Natalie Shapira | Guy Zwirn | Yoav Goldberg
Findings of the Association for Computational Linguistics: ACL 2023

Motivated by the question of the extent to which large language models “understand” social intelligence, we investigate the ability of such models to generate correct responses to questions involving descriptions of faux pas situations. The faux pas test is a test used in clinical psychology, which is known to be more challenging for children than individual tests of theory-of-mind or social intelligence. Our results demonstrate that, while the models seem to sometimes offer correct responses, they in fact struggle with this task, and that many of the seemingly correct responses can be attributed to over-interpretation by the human reader (“the ELIZA effect”). An additional phenomenon observed is the failure of most models to generate a correct response to presupposition questions. Finally, in an experiment in which the models are tasked with generating original faux pas stories, we find that while some models are capable of generating novel faux pas stories, the stories are all explicit, as the models are limited in their abilities to describe situations in an implicit manner.


pdf bib
Measuring Linguistic Synchrony in Psychotherapy
Natalie Shapira | Dana Atzil-Slonim | Rivka Tuval Mashiach | Ori Shapira
Proceedings of the Eighth Workshop on Computational Linguistics and Clinical Psychology

We study the phenomenon of linguistic synchrony between clients and therapists in a psychotherapy process. Linguistic Synchrony (LS) can be viewed as any observed interdependence or association between more than one person?s linguistic behavior. Accordingly, we establish LS as a methodological task. We suggest a LS function that applies a linguistic similarity measure based on the Jensen-Shannon distance across the observed part-of-speech tag distributions (JSDuPos) of the speakers in different time frames. We perform a study over a unique corpus of 872 transcribed sessions, covering 68 clients and 59 therapists. After establishing the presence of client-therapist LS, we verify its association with therapeutic alliance and treatment outcome (measured using WAI and ORS), and additionally analyse the behavior of JSDuPos throughout treatment. Results indicate that (1) higher linguistic similarity at the session level associates with higher therapeutic alliance as reported by the client and therapist at the end of the session, (2) higher linguistic similarity at the session level associates with higher level of treatment outcome as reported by the client at the beginnings of the next sessions, (3) there is a significant linear increase in linguistic similarity throughout treatment, (4) surprisingly, higher LS associates with lower treatment outcome. Finally, we demonstrate how the LS function can be used to interpret and explore the mechanism for synchrony.


pdf bib
Hebrew Psychological Lexicons
Natalie Shapira | Dana Atzil-Slonim | Daniel Juravski | Moran Baruch | Dana Stolowicz-Melman | Adar Paz | Tal Alfi-Yogev | Roy Azoulay | Adi Singer | Maayan Revivo | Chen Dahbash | Limor Dayan | Tamar Naim | Lidar Gez | Boaz Yanai | Adva Maman | Adam Nadaf | Elinor Sarfati | Amna Baloum | Tal Naor | Ephraim Mosenkis | Badreya Sarsour | Jany Gelfand Morgenshteyn | Yarden Elias | Liat Braun | Moria Rubin | Matan Kenigsbuch | Noa Bergwerk | Noam Yosef | Sivan Peled | Coral Avigdor | Rahav Obercyger | Rachel Mann | Tomer Alper | Inbal Beka | Ori Shapira | Yoav Goldberg
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

We introduce a large set of Hebrew lexicons pertaining to psychological aspects. These lexicons are useful for various psychology applications such as detecting emotional state, well being, relationship quality in conversation, identifying topics (e.g., family, work) and many more. We discuss the challenges in creating and validating lexicons in a new language, and highlight our methodological considerations in the data-driven lexicon construction process. Most of the lexicons are publicly available, which will facilitate further research on Hebrew clinical psychology text analysis. The lexicons were developed through data driven means, and verified by domain experts, clinical psychologists and psychology students, in a process of reconciliation with three judges. Development and verification relied on a dataset of a total of 872 psychotherapy session transcripts. We describe the construction process of each collection, the final resource and initial results of research studies employing this resource.

pdf bib
Automatic Identification of Ruptures in Transcribed Psychotherapy Sessions
Adam Tsakalidis | Dana Atzil-Slonim | Asaf Polakovski | Natalie Shapira | Rivka Tuval-Mashiach | Maria Liakata
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

We present the first work on automatically capturing alliance rupture in transcribed therapy sessions, trained on the text and self-reported rupture scores from both therapists and clients. Our NLP baseline outperforms a strong majority baseline by a large margin and captures client reported ruptures unidentified by therapists in 40% of such cases.