StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback

Authors: Shihan Dou, Yan Liu, Haoxiang Jia, Enyu Zhou, Limao Xiong, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
Published in: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 4571-4585
Identifier: dou-etal-2024-stepcoder
DOI: 10.18653/v1/2024.acl-long.251
URL: https://aclanthology.org/2024.acl-long.251/