On the Embeddings of Variables in Recurrent Neural Networks for Source Code

Nadezhda Chirkova


Abstract
Source code processing heavily relies on methods widely used in natural language processing (NLP), but involves specifics that must be taken into account to achieve higher quality. One such specific is that the semantics of a variable is defined not only by its name but also by the contexts in which the variable occurs. In this work, we develop dynamic embeddings, a recurrent mechanism that adjusts the learned semantics of a variable as it obtains more information about the variable's role in the program. We show that using the proposed dynamic embeddings significantly improves the performance of the recurrent neural network in the code completion and bug fixing tasks.
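To make the idea of dynamic embeddings concrete, the following is a minimal sketch, not the paper's actual architecture: each variable starts from a shared initialization, and every occurrence updates its embedding from the current context vector with a simple recurrent rule. The class name, the scalar gates `a` and `b`, and the elementwise `tanh` update are all illustrative assumptions standing in for the learned recurrent update described in the paper.

```python
import math


class DynamicEmbeddings:
    """Illustrative sketch of per-variable dynamic embeddings.

    Each variable id starts from a shared (here, zero) initialization.
    Every time the variable occurs, its embedding is refreshed from the
    current context vector with a toy recurrent rule:
        e_v <- tanh(a * e_v + b * context)
    A real model would use a learned RNN cell here; the scalars a, b
    and the zero init are placeholders for learned parameters.
    """

    def __init__(self, dim, a=0.5, b=0.5):
        self.dim = dim
        self.a = a  # weight on the variable's previous embedding
        self.b = b  # weight on the new context information
        self.table = {}  # variable id -> current embedding

    def lookup(self, var_id):
        # Unseen variables fall back to the shared initialization.
        return self.table.get(var_id, [0.0] * self.dim)

    def update(self, var_id, context):
        # Fold the new occurrence's context into the variable's embedding.
        e = self.lookup(var_id)
        self.table[var_id] = [
            math.tanh(self.a * ei + self.b * ci)
            for ei, ci in zip(e, context)
        ]
        return self.table[var_id]


# Usage: the embedding of "x" drifts as more occurrences are observed.
emb = DynamicEmbeddings(dim=2)
first = emb.update("x", [1.0, 0.0])
second = emb.update("x", [1.0, 0.0])
```

The key property this sketch demonstrates is that two variables with the same name in different programs (or the same variable at different points in one program) end up with different embeddings, because the embedding is a function of the occurrence history rather than a static lookup.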
Anthology ID:
2021.naacl-main.213
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2679–2689
URL:
https://aclanthology.org/2021.naacl-main.213
DOI:
10.18653/v1/2021.naacl-main.213
Bibkey:
Cite (ACL):
Nadezhda Chirkova. 2021. On the Embeddings of Variables in Recurrent Neural Networks for Source Code. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2679–2689, Online. Association for Computational Linguistics.
Cite (Informal):
On the Embeddings of Variables in Recurrent Neural Networks for Source Code (Chirkova, NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.213.pdf
Video:
https://aclanthology.org/2021.naacl-main.213.mp4