2025
AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering
Yungeng Liu, Zan Chen, Yuguang Wang, Yiqing Shen
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Protein engineering is important for many biomedical applications, but traditional approaches are often inefficient and resource-intensive. Deep learning (DL) models have shown promise, yet they remain difficult to implement for biologists without specialized computational expertise. To address this gap, we propose AutoProteinEngine (AutoPE), an agent framework that uses large language models (LLMs) for multimodal automated machine learning (AutoML) in protein engineering. AutoPE provides a conversational interface that lets biologists without DL backgrounds interact with DL models in natural language, lowering the entry barrier for protein engineering tasks. AutoPE integrates LLMs with AutoML to handle both protein sequence and graph modalities, automate hyperparameter optimization, and facilitate data retrieval from protein databases. We evaluated AutoPE on two real-world protein engineering tasks, where it substantially improved model performance over traditional zero-shot and manual fine-tuning approaches. By bridging the gap between DL and biologists’ domain expertise, AutoPE enables researchers to use advanced computational tools without extensive programming knowledge.