HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position

Kechi Zhang, Ge Li, Huangzhao Zhang, Zhi Jin


Abstract
Addressing the limitation of context length in large language models for code-related tasks is the primary focus of this paper. Existing LLMs are constrained by their pre-trained context lengths, leading to performance issues in handling long complex code sequences. Inspired by how human programmers navigate code, we introduce Hierarchical Rotary Position Embedding (HiRoPE), a novel approach that enhances the traditional rotary position embedding into a hierarchical format based on the hierarchical structure of source code. HiRoPE offers easy integration into existing LLMs without extra training costs. Our method is extensively evaluated with various LLMs, demonstrating stable performance in tasks such as language modeling and long code completion. We also introduce a new long code understanding task with real-world code projects, in hopes of promoting further development in this code-related field. Theoretically and experimentally, we find that HiRoPE also addresses the out-of-distribution issue in position encoding. Our HiRoPE significantly expands the context length capabilities of LLMs, enabling inference at lengths exponentially greater than the training length.
Anthology ID: 2024.acl-long.735
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 13615–13627
URL: https://aclanthology.org/2024.acl-long.735
Cite (ACL): Kechi Zhang, Ge Li, Huangzhao Zhang, and Zhi Jin. 2024. HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13615–13627, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position (Zhang et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-long.735.pdf