Beyond Quantity: Trajectory Diversity Scaling for Code Agents

Guhong Chen; Chenghao Sun; Cheng Fu; Qiyao Wang; Zhihong Huang; ChaoPeng Wei; Guangxu Chen; Feiteng Fang; Ahmadreza Argha; Bing Zhao; Xander Xu; Qi Han; Hamid Alinejad-Rokny; Qiang Qu; Binhua Li; Shiwen Ni; Min Yang; HU Wei; Yongbin Li

Abstract

As code large language models (LLMs) evolve into tool-interactive agents via the Model Context Protocol (MCP), their generalization is increasingly limited by low-quality synthetic data and the diminishing returns of quantity scaling: quantity-centric scaling hits an early bottleneck and underutilizes trajectory data. We propose TDScaling, a Trajectory Diversity Scaling-based data synthesis framework for code agents that scales performance through diversity rather than raw volume. TDScaling is also more data-efficient: under a fixed training budget, increasing trajectory diversity yields larger gains than adding more trajectories, improving the performance-cost trade-off for agent training. TDScaling integrates four innovations: (1) a Business Cluster mechanism that captures real-service logical dependencies; (2) a Blueprint-driven multi-agent paradigm that enforces trajectory coherence; (3) an adaptive evolution mechanism that steers synthesis toward long-tail scenarios using Domain Entropy, Reasoning Mode Entropy, and Cumulative Action Complexity to prevent mode collapse; and (4) a sandboxed code tool that mitigates catastrophic forgetting of intrinsic coding capabilities. Experiments on general tool-use benchmarks (BFCL, 𝜏²-Bench) and code agent tasks (RebenchT, CodeCI, BIRD) show that TDScaling improves both tool-use generalization and inherent coding proficiency. Crucially, we show that trajectory diversity scaling attains a substantially higher performance ceiling than quantity scaling, establishing a resource-efficient paradigm for training robust code agents under data bottlenecks.
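The abstract names Domain Entropy as one of the diversity signals but does not give its formula. As a rough illustration only, the sketch below assumes it behaves like Shannon entropy over the domain labels of a trajectory pool; the function name `shannon_entropy` and the example labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels.

    Assumption: used here as a stand-in for a domain-level diversity
    score; the paper's actual definition may differ.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the observed labels.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical domain labels for two synthetic trajectory pools of equal size.
skewed_pool = ["finance"] * 8
balanced_pool = ["finance", "travel", "health", "retail"] * 2

print("skewed pool entropy:", shannon_entropy(skewed_pool))
print("balanced pool entropy:", shannon_entropy(balanced_pool))
```

Under this assumption, both pools contain the same number of trajectories, yet the balanced pool scores higher, which is the kind of signal a synthesis loop could use to favor under-represented domains over simply adding more data.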

Anthology ID: 2026.findings-acl.768
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 15676–15691
URL: https://aclanthology.org/2026.findings-acl.768/
Cite (ACL): Guhong Chen, Chenghao Sun, Cheng Fu, Qiyao Wang, Zhihong Huang, ChaoPeng Wei, Guangxu Chen, Feiteng Fang, Ahmadreza Argha, Bing Zhao, Xander Xu, Qi Han, Hamid Alinejad-Rokny, Qiang Qu, Binhua Li, Shiwen Ni, Min Yang, HU Wei, and Yongbin Li. 2026. Beyond Quantity: Trajectory Diversity Scaling for Code Agents. In Findings of the Association for Computational Linguistics: ACL 2026, pages 15676–15691, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Beyond Quantity: Trajectory Diversity Scaling for Code Agents (Chen et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.768.pdf
Checklist: 2026.findings-acl.768.checklist.pdf
