CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model

Tae Hwan Jung


Abstract
A commit message is a document that summarizes source code changes in natural language. A good commit message clearly describes those changes and thereby improves collaboration between developers. Our work therefore aims to develop a model that writes commit messages automatically. To this end, we release a dataset of 345K pairs of code modifications and commit messages in six programming languages (Python, PHP, Go, Java, JavaScript, and Ruby). As in neural machine translation (NMT), we feed the code modification to the encoder and the commit message to the decoder, and we evaluate the generated commit messages with BLEU-4. We also propose two training methods that improve commit message generation: (1) a preprocessing method for feeding the code modification to the encoder, and (2) initializing the model with weights suited to the code domain, which reduces the gap in contextual representation between programming language (PL) and natural language (NL).
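The abstract names BLEU-4 as the evaluation metric but does not say which variant or tooling was used. As a rough illustration only, the sketch below computes an unsmoothed, sentence-level BLEU-4 against a single reference using whitespace tokenization. The function names `ngrams` and `bleu4`, the tokenization, and the absence of smoothing are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(reference, candidate):
    """Unsmoothed sentence-level BLEU-4 against one reference (illustrative)."""
    # Assumption: plain whitespace tokenization; the paper's tokenizer may differ.
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, a single zero precision zeroes the score.
            return 0.0
        log_precision_sum += math.log(matched / total)
    # Brevity penalty: applies when the candidate is not longer than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the four precisions, scaled by the brevity penalty.
    return bp * math.exp(log_precision_sum / 4)


if __name__ == "__main__":
    reference = "fix null pointer check in parser"
    print(bleu4(reference, "fix null pointer check in parser"))
    print(bleu4(reference, "fix null pointer check"))
```

Because short commit messages often share no 4-gram with the reference, an unsmoothed score like this one frequently collapses to zero; evaluation scripts for this kind of task commonly apply smoothing, so scores from this sketch should not be compared with reported numbers.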
Anthology ID:
2021.nlp4prog-1.3
Volume:
Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021)
Month:
August
Year:
2021
Address:
Online
Editors:
Royi Lachmy, Ziyu Yao, Greg Durrett, Milos Gligoric, Junyi Jessy Li, Ray Mooney, Graham Neubig, Yu Su, Huan Sun, Reut Tsarfaty
Venue:
NLP4Prog
Publisher:
Association for Computational Linguistics
Pages:
26–33
URL:
https://aclanthology.org/2021.nlp4prog-1.3
DOI:
10.18653/v1/2021.nlp4prog-1.3
Cite (ACL):
Tae Hwan Jung. 2021. CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model. In Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021), pages 26–33, Online. Association for Computational Linguistics.
Cite (Informal):
CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model (Jung, NLP4Prog 2021)
PDF:
https://aclanthology.org/2021.nlp4prog-1.3.pdf
Code
graykode/commit-autosuggestions
Data
CodeSearchNet