Chelsea Chandler
2021
Safeguarding against spurious AI-based predictions: The case of automated verbal memory assessment
Chelsea Chandler | Peter Foltz | Alex Cohen | Terje Holmlund | Brita Elvevåg
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access
A growing amount of psychiatric research incorporates machine learning and natural language processing methods; however, findings have yet to be translated into actual clinical decision support systems. Many of these studies are based on relatively small datasets in homogeneous populations, which carries the risk that the models may not perform adequately on new data in real clinical practice. Serious mental illness is by nature hard to define, hard to capture, and in need of frequent monitoring, which leads to imperfect data in which attribute and class noise are common. For an AI-mediated clinical decision support system to be effective, computational safeguards must be placed on the models in order to avoid spurious predictions and to allow humans to review data in settings where models are unstable or unlikely to generalize. This paper describes two approaches to implementing safeguards: (1) determining the cases in which models are unstable by means of attribute- and class-based outlier detection and (2) determining the extent to which models show inductive bias. These safeguards are illustrated in the automated scoring of a story recall task via natural language processing methods. With the integration of human-in-the-loop machine learning in the clinical implementation process, incorporating safeguards such as these into the models will offer patients increased protection from spurious predictions.
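The safeguard described in (1) lends itself to a short illustration: before an automated scorer's prediction is trusted, the input is checked against an attribute-outlier detector trained on the same data, and atypical inputs are deferred to human review. The sketch below is a minimal, assumed implementation using scikit-learn's IsolationForest and Ridge as stand-ins; the features, synthetic data, and thresholds are illustrative and not taken from the paper.

```python
# Minimal sketch of an attribute-based outlier safeguard: score an input only
# when it resembles the training data, otherwise defer to human review.
# Illustrative only; not the paper's actual models or features.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))           # stand-in language features from recalls
y_train = X_train @ np.array([1.0, 0.5, -0.3]) + rng.normal(scale=0.1, size=200)

scorer = Ridge(alpha=1.0).fit(X_train, y_train)           # automated scoring model
detector = IsolationForest(random_state=0).fit(X_train)   # attribute-outlier detector

def safeguarded_score(x):
    """Return an automated score only when x looks like training data; else defer."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    if detector.predict(x)[0] == -1:           # IsolationForest marks outliers with -1
        return None, "defer to human review"
    return float(scorer.predict(x)[0]), "automated score"

print(safeguarded_score([0.1, -0.2, 0.3]))     # typical input -> automated score
print(safeguarded_score([8.0, -9.0, 7.5]))     # atypical input -> deferred to a human
```

A class-based analogue would apply the same gating to the label side, for example by flagging training cases whose expert ratings disagree sharply with model predictions.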
2019
Overcoming the bottleneck in traditional assessments of verbal memory: Modeling human ratings and classifying clinical group membership
Chelsea Chandler | Peter W. Foltz | Jian Cheng | Jared C. Bernstein | Elizabeth P. Rosenfeld | Alex S. Cohen | Terje B. Holmlund | Brita Elvevåg
Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology
Verbal memory is affected by numerous clinical conditions and is evaluated in most neuropsychological and clinical examinations. However, a bottleneck exists in such endeavors because traditional methods require expert human review, and usually only a couple of test versions exist, thus limiting the frequency of administration and the clinical applications. The present study overcomes this bottleneck by automating the administration, transcription, analysis and scoring of story recall. A large group of healthy participants (n = 120) and patients with mental illness (n = 105) interacted with a mobile application that administered a wide range of assessments, including verbal memory. The speech generated by participants when retelling stories from the memory task was transcribed using automatic speech recognition tools and compared with human transcriptions (overall word error rate = 21%). An assortment of surface-level and semantic language-based features were extracted from the verbal recalls. A final set of three features was used both to predict expert human ratings with a ridge regression model (r = 0.88) and to differentiate patients from healthy individuals with an ensemble of logistic regression classifiers (accuracy = 76%). This is the first ‘outside of the laboratory’ study to showcase the viability of the complete pipeline of automated assessment of verbal memory in naturalistic settings.
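The modeling step of this pipeline can be sketched in a few lines: featurize a recall transcript, fit a ridge regressor to predict expert ratings, and fit a small ensemble of logistic regression classifiers for group membership. The sketch below is an assumed, toy version; the surface features, synthetic data, and ensemble composition are placeholders and not the paper's actual feature set or models.

```python
# Hedged sketch of the scoring pipeline: toy surface features from a recall,
# ridge regression for rating prediction, and a logistic regression ensemble
# for patient vs. healthy classification. Values printed are illustrative only.
import numpy as np
from sklearn.linear_model import Ridge, LogisticRegression
from sklearn.ensemble import VotingClassifier

def surface_features(recall: str, source: str) -> list:
    """Toy features: recall length, type-token ratio, word overlap with the story."""
    r, s = recall.lower().split(), source.lower().split()
    overlap = len(set(r) & set(s)) / max(len(set(s)), 1)
    ttr = len(set(r)) / max(len(r), 1)
    return [float(len(r)), ttr, overlap]

# Synthetic stand-ins for featurized recalls, expert ratings, and group labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(225, 3))
ratings = X @ np.array([0.2, 1.0, 2.0]) + rng.normal(scale=0.2, size=225)
groups = (ratings < np.median(ratings)).astype(int)     # 1 = patient (toy label)

rating_model = Ridge(alpha=1.0).fit(X, ratings)          # predicts expert ratings
ensemble = VotingClassifier(                             # ensemble of logistic regressions
    estimators=[(f"lr{i}", LogisticRegression(C=c)) for i, c in enumerate((0.1, 1.0, 10.0))],
    voting="soft",
).fit(X, groups)

x_new = surface_features("the hen laid golden eggs every day",
                         "a hen laid a golden egg each day")
print(rating_model.predict([x_new])[0], ensemble.predict([x_new])[0])
```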