AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering

Yungeng Liu; Zan Chen; Yuguang Wang; Yiqing Shen

Abstract

Protein engineering is important for various biomedical applications, but traditional approaches are often inefficient and resource-intensive. While deep learning (DL) models have shown promise, implementing them remains challenging for biologists without specialized computational expertise. To address this gap, we propose AutoProteinEngine (AutoPE), an agent framework that leverages large language models (LLMs) for multimodal automated machine learning (AutoML) in protein engineering. AutoPE introduces a conversational interface that lets biologists without DL backgrounds interact with DL models in natural language, lowering the entry barrier for protein engineering tasks. AutoPE integrates LLMs with AutoML to handle both protein sequence and graph modalities, automate hyperparameter optimization, and facilitate data retrieval from protein databases. We evaluated AutoPE on two real-world protein engineering tasks, demonstrating substantial improvements in model performance compared to traditional zero-shot and manual fine-tuning approaches. By bridging the gap between DL and biologists’ domain expertise, AutoPE empowers researchers to use advanced computational tools without extensive programming knowledge.

Anthology ID: 2025.coling-industry.36
Volume: Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 422–430
URL: https://aclanthology.org/2025.coling-industry.36/
Cite (ACL): Yungeng Liu, Zan Chen, Yuguang Wang, and Yiqing Shen. 2025. AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 422–430, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering (Liu et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-industry.36.pdf
