Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare

Hiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, Byron C Wallace


Abstract
We know from prior work that LLMs encode social biases, and that this manifests in clinical tasks. In this work we adopt tools from mechanistic interpretability to unveil sociodemographic representations and biases within LLMs in the context of healthcare. Specifically, we ask: Can we identify activations within LLMs that encode sociodemographic information (e.g., gender, race)? We find that, in three open-weight LLMs, gender information is highly localized in MLP layers and can be reliably manipulated at inference time via patching. Such interventions can surgically alter generated clinical vignettes for specific conditions, and also influence downstream clinical predictions that correlate with gender, e.g., patient risk of depression. We find that representation of patient race is somewhat more distributed, but can also be intervened upon, to a degree. To our knowledge, this is the first application of mechanistic interpretability methods to LLMs for healthcare.
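As a rough illustration of what "patching" MLP activations at inference time means, the sketch below runs a toy residual stack of MLP blocks in pure Python, caches the hidden activations from a source input, and overwrites one layer's activations while processing a target input. This is not the authors' code or models: the weights, inputs, and the function names `mlp_hidden`, `mlp_out`, and `forward` are all hypothetical, chosen only to show the mechanics.

```python
import math

def mlp_hidden(x, w_in):
    """Hidden activations of one MLP block (tanh nonlinearity)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_in]

def mlp_out(hidden, w_out):
    """Project hidden activations back to the residual width."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

def forward(x, layers, patch=None):
    """Run a residual stack of MLP blocks.

    patch: optional (layer_index, hidden_vector); if given, that layer's
    hidden activations are overwritten before the output projection.
    Returns (final residual state, per-layer hidden activations).
    """
    state = list(x)
    cache = []
    for i, (w_in, w_out) in enumerate(layers):
        hidden = mlp_hidden(state, w_in)
        if patch is not None and patch[0] == i:
            hidden = list(patch[1])  # swap in activations cached from another run
        cache.append(hidden)
        delta = mlp_out(hidden, w_out)
        state = [s + d for s, d in zip(state, delta)]  # residual update
    return state, cache

# Hypothetical toy weights: two blocks, residual width 2, hidden size 2.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.0], [0.0, 0.5]]),
    ([[0.0, 1.0], [1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]),
]

# Stand-ins for two prompts that differ only in one attribute value.
source_input = [1.0, 0.0]
target_input = [0.0, 1.0]

_, source_cache = forward(source_input, layers)   # 1) cache the source run
clean_out, _ = forward(target_input, layers)      # 2) unpatched target run
patched_out, _ = forward(                         # 3) target run, layer 0 patched
    target_input, layers, patch=(0, source_cache[0])
)

print("clean:  ", clean_out)
print("patched:", patched_out)
```

In a real LLM the same three steps would typically be carried out with framework-level hooks on specific MLP layers and token positions; comparing the clean and patched outputs indicates how much the patched activations carry the attribute of interest.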
Anthology ID:
2025.findings-emnlp.789
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14614–14631
URL:
https://aclanthology.org/2025.findings-emnlp.789/
Cite (ACL):
Hiba Ahsan, Arnab Sen Sharma, Silvio Amir, David Bau, and Byron C Wallace. 2025. Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14614–14631, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Elucidating Mechanisms of Demographic Bias in LLMs for Healthcare (Ahsan et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.789.pdf
Checklist:
2025.findings-emnlp.789.checklist.pdf