Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning

John Wu, David Wu, Jimeng Sun


Abstract
Medical coding, the translation of unstructured clinical text into standardized medical codes, is a crucial but time-consuming healthcare practice. Though large language models (LLMs) could automate the coding process and improve the efficiency of such tasks, interpretability remains paramount for maintaining patient trust. Current efforts toward interpretability in medical coding applications rely heavily on label attention mechanisms, which often highlight extraneous tokens irrelevant to the ICD code. To facilitate accurate interpretability in medical language models, this paper leverages dictionary learning, which can efficiently extract sparsely activated representations from dense language model embeddings in superposition. Compared with common label attention mechanisms, our model goes beyond token-level representations by building an interpretable dictionary that enhances the mechanistic explanations for each ICD code prediction, even when the highlighted tokens are medically irrelevant. We show that dictionary features are human interpretable, can elucidate the hidden meanings of upwards of 90% of medically irrelevant tokens, and can steer model behavior.
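The dictionary learning the abstract refers to is commonly realized as a sparse autoencoder: a dense embedding is encoded into an overcomplete set of non-negative feature activations (most of them zero), and a decoder reconstructs the embedding as a sparse combination of dictionary directions. The sketch below is illustrative only, not the paper's implementation; all dimensions, initializations, and function names are assumptions for demonstration.

```python
import numpy as np

def sparse_dictionary_encode(x, W_enc, b_enc):
    """Map dense embeddings to non-negative, sparsely activated dictionary features."""
    return np.maximum(0.0, x @ W_enc + b_enc)  # ReLU keeps activations sparse

def sparse_dictionary_decode(f, W_dec, b_dec):
    """Reconstruct dense embeddings as a sparse combination of dictionary directions."""
    return f @ W_dec + b_dec

rng = np.random.default_rng(0)
d_model, d_dict = 16, 64                      # overcomplete: d_dict > d_model
W_enc = rng.normal(scale=0.1, size=(d_model, d_dict))
b_enc = np.full(d_dict, -0.05)                # negative bias pushes activations to zero
W_dec = rng.normal(scale=0.1, size=(d_dict, d_model))
b_dec = np.zeros(d_model)

x = rng.normal(size=(4, d_model))             # a batch of dense token embeddings
f = sparse_dictionary_encode(x, W_enc, b_enc)
x_hat = sparse_dictionary_decode(f, W_dec, b_dec)

# Training would minimize reconstruction error plus an L1 sparsity penalty:
recon_loss = np.mean((x - x_hat) ** 2)
l1_penalty = np.mean(np.abs(f))
```

After training on real model activations, each dictionary feature (a row of `W_dec`) can be inspected for a human-interpretable meaning, which is what lets such methods explain tokens whose surface form looks medically irrelevant.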
Anthology ID:
2024.emnlp-main.500
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8848–8871
URL:
https://aclanthology.org/2024.emnlp-main.500
Cite (ACL):
John Wu, David Wu, and Jimeng Sun. 2024. Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 8848–8871, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning (Wu et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.500.pdf