WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models

Huawen Feng; Pu Zhao; Qingfeng Sun; Can Xu; Fangkai Yang; Lu Wang; Qianli Ma; Qingwei Lin; Saravan Rajmohan; Dongmei Zhang; Qi Zhang

doi:10.18653/v1/2025.acl-long.246

Abstract

Despite the recent progress of code large language models (LLMs), their remarkable abilities depend largely on fine-tuning with high-quality data, which poses challenges for data collection and annotation. To address this, current methods often design various data flywheels to collect complex code instructions, enabling models to handle more intricate tasks. However, these approaches typically rely on off-the-shelf datasets and on data augmentation from a limited set of proprietary LLMs (e.g., Claude and GPT-4), which restricts the diversity of the constructed data and makes it prone to systemic biases. In this paper, we propose **WarriorCoder**, a novel paradigm that learns from expert battles to address these limitations. Specifically, we create an arena where leading expert code LLMs challenge each other, with evaluations conducted by impartial judges. This competitive framework generates novel training data from scratch, leveraging the strengths of all participants. Experimental results show that **WarriorCoder** achieves state-of-the-art performance compared to previous models of the same size, even without relying on proprietary LLMs.

Anthology ID: 2025.acl-long.246
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 4955–4969
URL: https://aclanthology.org/2025.acl-long.246/
DOI: 10.18653/v1/2025.acl-long.246
Cite (ACL): Huawen Feng, Pu Zhao, Qingfeng Sun, Can Xu, Fangkai Yang, Lu Wang, Qianli Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, and Qi Zhang. 2025. WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4955–4969, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models (Feng et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.246.pdf
