ReACC: A Retrieval-Augmented Code Completion Framework

Shuai Lu; Nan Duan; Hojae Han; Daya Guo; Seung-won Hwang; Alexey Svyatkovskiy

doi:10.18653/v1/2022.acl-long.431

Abstract

Code completion, which aims to predict the following code token(s) according to the code context, can improve the productivity of software development. Recent work has shown that statistical language modeling with transformers can greatly improve performance on the code completion task by learning from large-scale source code datasets. However, current approaches focus only on code context within the file or project, i.e., internal context. Our distinction is utilizing "external" context, inspired by the human behavior of copying from related code snippets when writing code. Specifically, we propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We adopt a stage-wise training approach that combines a source code retriever and an auto-regressive language model for programming languages. We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
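The retrieve-then-complete idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`tokenize`, `jaccard`, `retrieve`, `build_prompt`), the two-snippet corpus, and the token-overlap score, which is a toy stand-in rather than the retriever described in the paper. The language model itself is omitted; the sketch only shows how a retrieved snippet could be placed in front of the unfinished code as "external" context.

```python
import re


def tokenize(code):
    """Split code into a set of identifier-like tokens and single symbols."""
    return set(re.findall(r"[A-Za-z_]\w*|\S", code))


def jaccard(a, b):
    """Token-set overlap between two snippets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(context, corpus):
    """Return the corpus snippet with the highest token overlap with the context."""
    ctx_tokens = tokenize(context)
    return max(corpus, key=lambda snippet: jaccard(ctx_tokens, tokenize(snippet)))


def build_prompt(context, corpus):
    """Prepend the retrieved snippet to the unfinished code (hypothetical layout)."""
    return retrieve(context, corpus) + "\n" + context


# Hypothetical external code base and unfinished code.
corpus = [
    "def read_json(path):\n    with open(path) as f:\n        return json.load(f)",
    "def add(a, b):\n    return a + b",
]
unfinished = "def load_config(path):\n    with open(path) as f:\n        return "

# The prompt would then be passed to an auto-regressive code language model.
prompt = build_prompt(unfinished, corpus)
print(prompt)
```

In this sketch the file-reading snippet is retrieved because it shares more tokens with the unfinished function, so a generator conditioned on the prompt could copy from it when predicting the next tokens.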

Anthology ID: 2022.acl-long.431
Volume: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6227–6240
URL: https://aclanthology.org/2022.acl-long.431
DOI: 10.18653/v1/2022.acl-long.431
Cite (ACL): Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, and Alexey Svyatkovskiy. 2022. ReACC: A Retrieval-Augmented Code Completion Framework. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6227–6240, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): ReACC: A Retrieval-Augmented Code Completion Framework (Lu et al., ACL 2022)
PDF: https://aclanthology.org/2022.acl-long.431.pdf
Code: celbree/reacc
Data: CodeSearchNet, CodeXGLUE
