Boosting Code Summarization by Embedding Code Structures

Jikyoeng Son, Joonghyuk Hahn, HyeonTae Seo, Yo-Sub Han


Abstract
Recent research on code summarization relies on structural information from the abstract syntax tree (AST) of source code. It is questionable, however, whether the AST is the most effective way to express that structural information. We find that a program dependency graph (PDG) can represent the structure of code more effectively. We propose the PDG Boosting Module (PBM), which encodes a PDG into a graph embedding, and a framework for integrating PBM with existing models. PBM achieves average improvements of 6.67% (BLEU) and 7.47% (ROUGE). We then analyze the experimental results and examine how PBM helps the training of baseline models and how robust its performance is. To validate robustness, we measure performance on an out-of-domain benchmark dataset and confirm its robustness. In addition, we apply a new evaluation measure, SBERT score, to evaluate semantic performance. Models implemented with PBM achieve higher SBERT scores, which implies that they generate summaries that are semantically more similar to the reference summary.
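To make the AST versus PDG distinction concrete, the following is a minimal sketch of the data-dependence half of a PDG for straight-line Python code. It is not the authors' PBM and does not produce a graph embedding. The function name `data_dependence_edges` and the sample snippet are hypothetical, and the sketch assumes top-level statements with no branches or loops, so control-dependence edges are absent.

```python
from __future__ import annotations

import ast


def data_dependence_edges(source: str) -> list[tuple[int, int]]:
    """Return sorted (definition_line, use_line) pairs for straight-line code.

    Assumption: only top-level statements executed in order are handled;
    branches, loops, and augmented assignments are out of scope here.
    """
    tree = ast.parse(source)
    last_def: dict[str, int] = {}  # variable name -> line of latest assignment
    edges: set[tuple[int, int]] = set()
    for stmt in tree.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads first: a use depends on the most recent earlier definition.
        for n in names:
            if isinstance(n.ctx, ast.Load) and n.id in last_def:
                edges.add((last_def[n.id], stmt.lineno))
        # Then writes: this statement becomes the reaching definition.
        for n in names:
            if isinstance(n.ctx, ast.Store):
                last_def[n.id] = stmt.lineno
    return sorted(edges)


if __name__ == "__main__":
    snippet = "a = 1\nb = 2\nc = a + b\nd = c * a\n"
    print(data_dependence_edges(snippet))
```

An AST records only how each statement nests syntactically. The edges returned here link a statement to the earlier statements whose values it reads, which is the kind of relation a PDG makes explicit.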
Anthology ID:
2022.coling-1.521
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5966–5977
URL:
https://aclanthology.org/2022.coling-1.521
Cite (ACL):
Jikyoeng Son, Joonghyuk Hahn, HyeonTae Seo, and Yo-Sub Han. 2022. Boosting Code Summarization by Embedding Code Structures. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5966–5977, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Boosting Code Summarization by Embedding Code Structures (Son et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.521.pdf
Data
CodeSearchNet