Code Vulnerability Detection via Nearest Neighbor Mechanism

Qianjin Du, Xiaohui Kuang, Gang Zhao


Abstract
Code vulnerability detection is a fundamental and challenging task in software security. Existing work aims to learn semantic information from source code using NLP techniques. However, some vulnerable samples are very similar to non-vulnerable samples, which makes them difficult to identify. To address this issue and improve detection performance, we introduce a k-nearest neighbor mechanism that retrieves multiple neighbor samples and uses their label information to assist model predictions. In addition, we use supervised contrastive learning so that the model learns discriminative representations and the labels of retrieved neighbors are as consistent as possible with the labels of the test samples. Extensive experiments show that our method achieves clear performance improvements over baseline models.
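The nearest-neighbor idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how neighbor labels are combined with the model output, so the Euclidean distance, the softmax weighting over negative distances, the interpolation weight `lam`, and all function names and toy data below are assumptions made for illustration only.

```python
import math

def knn_label_distribution(query, bank, labels, k=3, num_classes=2, temperature=1.0):
    """Turn the labels of the k nearest stored samples into a class distribution.

    Assumption: closer neighbors get more weight via a softmax over
    negative Euclidean distances (a common choice, not taken from the paper).
    """
    dists = [math.dist(query, emb) for emb in bank]
    nearest = sorted(range(len(bank)), key=lambda i: dists[i])[:k]
    scores = [-dists[i] / temperature for i in nearest]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    probs = [0.0] * num_classes
    for i, w in zip(nearest, weights):
        probs[labels[i]] += w / total
    return probs

def interpolate(model_probs, knn_probs, lam=0.5):
    """Blend classifier probabilities with the neighbor-label distribution.

    Assumption: a simple linear mix controlled by the hypothetical weight lam.
    """
    return [lam * q + (1.0 - lam) * p for p, q in zip(model_probs, knn_probs)]

# Toy embedding bank: label 0 = non-vulnerable, label 1 = vulnerable.
bank = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]

query = (4.5, 5.0)            # embedding of a test sample (made up)
model_probs = [0.55, 0.45]    # classifier alone leans non-vulnerable

knn_probs = knn_label_distribution(query, bank, labels, k=3)
final_probs = interpolate(model_probs, knn_probs, lam=0.5)
predicted = max(range(len(final_probs)), key=lambda c: final_probs[c])
```

In this toy case the classifier alone slightly favors the non-vulnerable class, while the retrieved neighbors are mostly vulnerable, so the blended prediction flips to vulnerable. The sketch also hints at why the abstract pairs retrieval with supervised contrastive learning: retrieval only helps if samples with the same label sit close together in the embedding space.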
Anthology ID:
2022.findings-emnlp.459
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6173–6178
URL:
https://aclanthology.org/2022.findings-emnlp.459
DOI:
10.18653/v1/2022.findings-emnlp.459
Cite (ACL):
Qianjin Du, Xiaohui Kuang, and Gang Zhao. 2022. Code Vulnerability Detection via Nearest Neighbor Mechanism. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 6173–6178, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Code Vulnerability Detection via Nearest Neighbor Mechanism (Du et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.459.pdf