Rethinking-based Code Summarization with Chain of Comments

Liuwen Cao, Hongkui He, Hailin Huang, Jiexin Wang, Yi Cai


Abstract
Automatic code summarization aims to generate concise natural language descriptions (summaries) of source code, freeing software developers from the heavy burden of manual commenting and software maintenance. Existing methods focus on learning a direct mapping from raw code to summaries, overlooking the significant heterogeneity gap between the two. Moreover, existing methods lack a human-like re-check process that evaluates whether a generated summary actually matches the code. To address these two limitations, we introduce RBCoSum, a novel framework that uses a generated Chain Of Comments (COC) as auxiliary intermediate information to bridge the gap between code and summaries. We also propose a rethinking process in which a learned ranker, trained on a ranking dataset we construct, scores how well each generated summary matches the code; the highest-scoring summary is selected, which realizes the re-check step. We conduct extensive experiments to evaluate our approach and compare it with other automatic code summarization models as well as multiple code Large Language Models (LLMs). The experimental results show that RBCoSum is effective and outperforms the baselines by a large margin. Human evaluation also shows that the summaries generated with RBCoSum are more natural, informative, useful, and truthful.
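The selection step of the rethinking process described in the abstract (score each candidate summary against the code, keep the highest-scoring one) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `select_summary` and `toy_overlap_score` are hypothetical, and the word-overlap scorer is only a stand-in for the learned ranker, whose architecture and training data are not described on this page.

```python
import re
from typing import Callable, Sequence


def toy_overlap_score(code: str, summary: str) -> float:
    """Hypothetical stand-in for a learned ranker.

    Returns the fraction of distinct summary words that also occur
    as alphabetic tokens in the code. A real ranker would be a
    trained model; this heuristic only makes the sketch runnable.
    """
    code_tokens = set(re.findall(r"[a-z]+", code.lower()))
    summary_words = set(re.findall(r"[a-z]+", summary.lower()))
    if not summary_words:
        return 0.0
    return len(summary_words & code_tokens) / len(summary_words)


def select_summary(
    code: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float] = toy_overlap_score,
) -> str:
    """Return the candidate summary with the highest score for `code`."""
    if not candidates:
        raise ValueError("need at least one candidate summary")
    return max(candidates, key=lambda summary: score(code, summary))


if __name__ == "__main__":
    snippet = "def add(a, b):\n    return a + b"
    options = ["Sort a list.", "Add a and b and return the sum."]
    print(select_summary(snippet, options))
```

Passing the scorer as a parameter keeps the selection logic independent of the ranking model, so a trained ranker could be substituted for the toy heuristic without changing `select_summary`.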
Anthology ID:
2025.coling-main.204
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3043–3056
URL:
https://aclanthology.org/2025.coling-main.204/
Cite (ACL):
Liuwen Cao, Hongkui He, Hailin Huang, Jiexin Wang, and Yi Cai. 2025. Rethinking-based Code Summarization with Chain of Comments. In Proceedings of the 31st International Conference on Computational Linguistics, pages 3043–3056, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Rethinking-based Code Summarization with Chain of Comments (Cao et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.204.pdf