CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution

Shidong Yang; Ziyu Ma; Tongwen Huang; Yiming Hu; Yong Wang; Xiangxiang Chu

Abstract

Reinforcement learning for LLM agents is typically conducted on a static data distribution, which fails to adapt to the agent’s evolving behavior and leads to poor coverage of complex environment interactions. To address these challenges, we propose CoEvolve, an agent-data mutual evolution framework that enables LLM agents to improve through closed-loop, interaction-driven training. Specifically, CoEvolve extracts feedback signals such as forgetting and uncertainty from rollout trajectories to identify failure-prone interaction patterns, and utilizes them to guide LLM-based task synthesis. The synthesized tasks are validated through environment interaction and utilized to update the data distribution, enabling joint adaptation of the agent and its data. Extensive experiments on AppWorld and BFCL across Qwen2.5-7B, Qwen3-4B, and Qwen3-30B-A3B demonstrate consistent and significant improvements over strong base models, yielding absolute gains of 19.43%, 15.58%, and 18.14%, respectively.
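
The abstract names forgetting and uncertainty as feedback signals but does not define them, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that "forgetting" means a task solved in an earlier training round but failed in the latest one, and that "uncertainty" can be approximated by the variance of the success rate across repeated rollouts. All function names, the data layout, and the `min_uncertainty` threshold are hypothetical. Task synthesis and environment validation are omitted because they require an LLM and a live environment.

```python
from typing import Dict, List


def forgetting_signal(history: List[bool]) -> bool:
    """Assumed definition: solved in some earlier round, failed in the latest."""
    return len(history) >= 2 and any(history[:-1]) and not history[-1]


def uncertainty_signal(rollouts: List[bool]) -> float:
    """Assumed proxy: Bernoulli variance p * (1 - p) of the rollout success rate.

    The value is 0.0 when all rollouts agree and peaks at 0.25 when p = 0.5.
    """
    if not rollouts:
        return 0.0
    p = sum(rollouts) / len(rollouts)
    return p * (1.0 - p)


def select_failure_prone(
    round_outcomes: Dict[str, List[bool]],
    latest_rollouts: Dict[str, List[bool]],
    min_uncertainty: float = 0.2,  # hypothetical threshold
) -> List[str]:
    """Return sorted task IDs flagged by either signal.

    round_outcomes: per-task solved/failed flag for each training round.
    latest_rollouts: per-task success flags for rollouts in the latest round.
    """
    selected = []
    for task_id, history in round_outcomes.items():
        forgotten = forgetting_signal(history)
        uncertain = uncertainty_signal(latest_rollouts.get(task_id, [])) >= min_uncertainty
        if forgotten or uncertain:
            selected.append(task_id)
    return sorted(selected)
```

In a closed loop of the kind the abstract describes, the selected task IDs would seed LLM-based task synthesis, and only synthesized tasks that pass environment validation would be added to the training distribution.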

Anthology ID: 2026.acl-long.1055
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 23015–23036
URL: https://aclanthology.org/2026.acl-long.1055/
Cite (ACL): Shidong Yang, Ziyu Ma, Tongwen Huang, Yiming Hu, Yong Wang, and Xiangxiang Chu. 2026. CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 23015–23036, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution (Yang et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1055.pdf
Checklist: 2026.acl-long.1055.checklist.pdf