Functional Overlap Reranking for Neural Code Generation

Hung To, Minh Nguyen, Nghi Bui


Abstract
Code Large Language Models (CodeLLMs) have markedly advanced code generation. However, selecting the best solution from the many outputs a CodeLLM can produce remains a challenge. Previous methods often overlook the functional similarities and interactions between solution clusters. We introduce SRank, a novel reranking strategy for selecting the best solutions from code generation, which focuses on modeling the relationships between clusters of solutions. By quantifying the functional overlap between solution clusters, our approach provides a better ranking strategy for code solutions. Empirical results show that our method achieves strong pass@1 scores. For instance, on the HumanEval benchmark, we achieve 69.66% pass@1 with Codex002, 75.31% with WizardCoder, 53.99% with StarCoder, and 60.55% with CodeGen, surpassing state-of-the-art code generation reranking methods such as CodeT and Coder-Reviewer on the same CodeLLM by a significant margin (approximately 6.1% improvement on average). Even with a limited number of sampled solutions and test cases, our approach remains robust and continues to outperform these baselines. Our implementation can be found at https://github.com/FSoft-AI4Code/SRank-CodeRanker.
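To make the idea of functional overlap between solution clusters concrete, here is a minimal Python sketch. The abstract does not specify how clusters are formed or scored, so the details are assumptions for illustration only: solutions are grouped when their outputs agree on every test input, the overlap of two clusters is the number of test inputs on which their outputs match, and a cluster's score is its overlap with every cluster (itself included) weighted by that cluster's size. The names `cluster_by_outputs`, `functional_overlap`, and `rank_clusters` are hypothetical and are not taken from the released implementation.

```python
from collections import defaultdict


def cluster_by_outputs(outputs_per_solution):
    """Group solutions whose outputs agree on every test input (assumed clustering rule)."""
    clusters = defaultdict(list)
    for name, outputs in outputs_per_solution.items():
        clusters[tuple(outputs)].append(name)
    return clusters  # maps output signature -> list of solution names


def functional_overlap(sig_a, sig_b):
    """Count the test inputs on which two clusters produce the same output."""
    return sum(a == b for a, b in zip(sig_a, sig_b))


def rank_clusters(outputs_per_solution):
    """Rank clusters by an assumed score: overlap with each cluster times that cluster's size."""
    clusters = cluster_by_outputs(outputs_per_solution)
    scores = {
        sig_i: sum(
            functional_overlap(sig_i, sig_j) * len(members_j)
            for sig_j, members_j in clusters.items()
        )
        for sig_i in clusters
    }
    ranked = sorted(clusters, key=lambda sig: scores[sig], reverse=True)
    return [(clusters[sig], scores[sig]) for sig in ranked]


# Toy data: outputs of four sampled solutions on three test inputs.
outputs = {
    "s1": [1, 4, 9],
    "s2": [1, 4, 9],
    "s3": [1, 4, 7],
    "s4": [0, 0, 0],
}
ranking = rank_clusters(outputs)
best_members, best_score = ranking[0]
```

In this toy setting the two-solution cluster ranks first because it is both the largest and agrees with the next cluster on two of three inputs. A reranker of this kind would then return a solution from the top-ranked cluster as its pass@1 candidate.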
Anthology ID:
2024.findings-acl.220
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3686–3704
URL:
https://aclanthology.org/2024.findings-acl.220
Cite (ACL):
Hung To, Minh Nguyen, and Nghi Bui. 2024. Functional Overlap Reranking for Neural Code Generation. In Findings of the Association for Computational Linguistics ACL 2024, pages 3686–3704, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Functional Overlap Reranking for Neural Code Generation (To et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.220.pdf