M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation

Jiaheng Liu; Ken Deng; Congnan Liu; Jian Yang; Shukai Liu; He Zhu; Peng Zhao; Linzheng Chai; Yanan Wu; JinKe JinKe; Ge Zhang; Zekun Moore Wang; Guoan Zhang; Yingshui Tan; Bangyu Xiang; Zhaoxiang Zhang; Wenbo Su; Bo Zheng

doi:10.18653/v1/2025.acl-long.763

Abstract

Repository-level code completion has drawn great attention in software engineering, and several benchmarks have been introduced. However, existing repository-level code completion benchmarks usually cover only a few languages (<5), so they cannot evaluate the general code intelligence of existing code Large Language Models (LLMs) across different languages. Moreover, existing benchmarks usually report only average scores over languages, ignoring fine-grained abilities in different completion scenarios. Therefore, to facilitate research on code LLMs in multilingual scenarios, we propose M2RC-EVAL, a massively multilingual repository-level code completion benchmark covering 18 programming languages. It provides two types of fine-grained annotations (bucket-level and semantic-level) for different completion scenarios, which we derive from the parsed abstract syntax tree. We also curate M2RC-INSTRUCT, a massively multilingual instruction corpus, to improve the repository-level code completion abilities of existing code LLMs. Comprehensive experimental results demonstrate the effectiveness of M2RC-EVAL and M2RC-INSTRUCT.

Anthology ID: 2025.acl-long.763
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 15661–15684
URL: https://aclanthology.org/2025.acl-long.763/
DOI: 10.18653/v1/2025.acl-long.763
Cite (ACL): Jiaheng Liu, Ken Deng, Congnan Liu, Jian Yang, Shukai Liu, He Zhu, Peng Zhao, Linzheng Chai, Yanan Wu, JinKe JinKe, Ge Zhang, Zekun Moore Wang, Guoan Zhang, Yingshui Tan, Bangyu Xiang, Zhaoxiang Zhang, Wenbo Su, and Bo Zheng. 2025. M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15661–15684, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation (Liu et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.763.pdf
