Unveiling Performance Challenges of Large Language Models in Low-Resource Healthcare: A Demographic Fairness Perspective

Yue Zhou, Barbara Di Eugenio, Lu Cheng


Abstract
This paper studies the performance of large language models (LLMs), particularly regarding demographic fairness, in solving real-world healthcare tasks. We evaluate state-of-the-art LLMs with three prevalent learning frameworks across six diverse healthcare tasks and find significant challenges in applying LLMs to these tasks, as well as persistent fairness issues across demographic groups. We also find that explicitly providing demographic information yields mixed results, while LLMs' ability to infer such details raises concerns about biased health predictions. Using LLMs as autonomous agents with access to up-to-date guidelines does not guarantee performance improvement. We believe these findings reveal the critical limitations of LLMs in healthcare fairness and the urgent need for specialized research in this area.
Anthology ID:
2025.coling-main.485
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
7266–7278
URL:
https://aclanthology.org/2025.coling-main.485/
Cite (ACL):
Yue Zhou, Barbara Di Eugenio, and Lu Cheng. 2025. Unveiling Performance Challenges of Large Language Models in Low-Resource Healthcare: A Demographic Fairness Perspective. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7266–7278, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Unveiling Performance Challenges of Large Language Models in Low-Resource Healthcare: A Demographic Fairness Perspective (Zhou et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.485.pdf