MIMIR: A Customizable Agent Tuning Platform for Enhanced Scientific Applications

Xiangru Tang, Chunyuan Deng, Hanminwang Hanminwang, Haoran Wang, Yilun Zhao, Wenqi Shi, Yi Fung, Wangchunshu Zhou, Jiannan Cao, Heng Ji, Arman Cohan, Mark Gerstein


Abstract
Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across various tasks. However, without agent-tuning, open-source models like LLaMA2 currently struggle to match the efficiency of larger models such as GPT-4 in scientific applications due to a lack of agent tuning datasets. In response, we introduce MIMIR, a streamlined platform that leverages large LLMs to generate agent-tuning data for fine-tuning smaller, specialized models. By employing a role-playing methodology, MIMIR enables larger models to simulate various roles and create interaction data, which can then be used to fine-tune open-source models like LLaMA2. This approach ensures that even smaller models can effectively serve as agents in scientific tasks. Integrating these features into an end-to-end platform, MIMIR facilitates everything from the uploading of scientific data to one-click agent fine-tuning. MIMIR is publicly released and actively maintained at https://github. com/gersteinlab/MIMIR, along with a demo video for quick-start, calling for broader development.
Anthology ID:
2024.emnlp-demo.49
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Delia Irazu Hernandez Farias, Tom Hope, Manling Li
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
486–496
Language:
URL:
https://aclanthology.org/2024.emnlp-demo.49
DOI:
Bibkey:
Cite (ACL):
Xiangru Tang, Chunyuan Deng, Hanminwang Hanminwang, Haoran Wang, Yilun Zhao, Wenqi Shi, Yi Fung, Wangchunshu Zhou, Jiannan Cao, Heng Ji, Arman Cohan, and Mark Gerstein. 2024. MIMIR: A Customizable Agent Tuning Platform for Enhanced Scientific Applications. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 486–496, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
MIMIR: A Customizable Agent Tuning Platform for Enhanced Scientific Applications (Tang et al., EMNLP 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.emnlp-demo.49.pdf