A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type

Trevor Cohen, Serguei Pakhomov


Abstract
Recent years have seen growing interest in computational methods for distinguishing elicited speech samples produced by patients with dementia from those produced by healthy controls. The difference between perplexity estimates from two neural language models (LMs), one trained on transcripts of speech produced by healthy participants and one trained on transcripts from participants with dementia, has been shown to produce state-of-the-art performance when used as a single feature for diagnostic classification of unseen transcripts. However, little is known about why this approach is effective, and because cases and controls are not matched in the most widely used evaluation set of transcripts (DementiaBank), it is unclear whether these approaches are truly diagnostic or are sensitive to other variables. In this paper, we interrogate neural LMs trained on participants with and without dementia using synthetic narratives previously developed to simulate progressive semantic dementia by manipulating lexical frequency. We find that the perplexity of neural LMs is strongly and differentially associated with lexical frequency, and that a mixture model obtained by interpolating control and dementia LMs improves upon the current state of the art for models trained exclusively on transcript text.
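The two quantities named in the abstract, a perplexity difference between a control LM and a dementia LM and a mixture of the two LMs, can be illustrated with a minimal sketch. It assumes each LM has already assigned a probability to every token of a transcript. The function names, the use of a raw (rather than log) perplexity difference, and linear interpolation of per-token probabilities with weight `lam` are illustrative assumptions, not details taken from the paper or its code.

```python
import math
from typing import List, Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity: exp of the mean negative log probability per token."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def perplexity_difference(control_probs: Sequence[float],
                          dementia_probs: Sequence[float]) -> float:
    """Single feature: control-LM perplexity minus dementia-LM perplexity.

    Assumption: a positive value means the transcript is better
    predicted by the dementia LM than by the control LM.
    """
    return perplexity(control_probs) - perplexity(dementia_probs)


def interpolate(control_probs: Sequence[float],
                dementia_probs: Sequence[float],
                lam: float) -> List[float]:
    """Hypothetical mixture: lam * p_dementia + (1 - lam) * p_control per token."""
    if len(control_probs) != len(dementia_probs):
        raise ValueError("both LMs must score the same token sequence")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * d + (1.0 - lam) * c
            for c, d in zip(control_probs, dementia_probs)]


# Made-up per-token probabilities for one transcript (not real data).
control = [0.20, 0.05, 0.10, 0.30]
dementia = [0.25, 0.15, 0.10, 0.20]

feature = perplexity_difference(control, dementia)
mixture_ppl = perplexity(interpolate(control, dementia, lam=0.5))
```

With `lam=0.0` the mixture reduces to the control LM and with `lam=1.0` to the dementia LM, so sweeping `lam` between those values traces how the mixture's perplexity on a transcript shifts from one model to the other.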
Anthology ID:
2020.acl-main.176
Original:
2020.acl-main.176v1
Version 2:
2020.acl-main.176v2
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1946–1957
URL:
https://aclanthology.org/2020.acl-main.176
DOI:
10.18653/v1/2020.acl-main.176
Cite (ACL):
Trevor Cohen and Serguei Pakhomov. 2020. A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1946–1957, Online. Association for Computational Linguistics.
Cite (Informal):
A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer’s Type (Cohen & Pakhomov, ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.176.pdf
Video:
http://slideslive.com/38929174
Code:
treversec/tale_of_two_perplexities
Data:
WikiText-2