Evaluating Large Language Models for Predicting Protein Behavior under Radiation Exposure and Disease Conditions

Ryan Engel, Gilchan Park


Abstract
The primary concern with exposure to ionizing radiation is the risk of developing diseases. While high doses of radiation can cause immediate damage leading to cancer, the effects of low-dose radiation (LDR) are less clear and more controversial. Investigating this further requires focusing on the underlying biological structures affected by radiation. Recent work has shown that Large Language Models (LLMs) can effectively predict protein structures and other biological properties. The aim of this research is to use open-source LLMs, such as Mistral, Llama 2, and Llama 3, to predict both radiation-induced alterations in proteins and the dynamics of protein-protein interactions (PPIs) in the presence of specific diseases. We show that fine-tuning these models yields state-of-the-art performance for predicting protein interactions in the context of neurodegenerative diseases, metabolic disorders, and cancer. Our findings contribute to the ongoing efforts to understand the complex relationships between radiation exposure and disease mechanisms, illustrating the nuanced capabilities and limitations of current computational models.
Anthology ID:
2024.bionlp-1.34
Volume:
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
427–439
URL:
https://aclanthology.org/2024.bionlp-1.34
DOI:
10.18653/v1/2024.bionlp-1.34
Cite (ACL):
Ryan Engel and Gilchan Park. 2024. Evaluating Large Language Models for Predicting Protein Behavior under Radiation Exposure and Disease Conditions. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 427–439, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Evaluating Large Language Models for Predicting Protein Behavior under Radiation Exposure and Disease Conditions (Engel & Park, BioNLP-WS 2024)
PDF:
https://aclanthology.org/2024.bionlp-1.34.pdf