VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM

Lesheng Jin, Zhenyuan Ruan, Haohui Mai, Jingbo Shang


Abstract
Modern GPUs evolve rapidly, yet production compilers still rely on hand-crafted register allocation heuristics that require substantial re-tuning for each hardware generation. We introduce VeriLocc, a framework that combines large language models (LLMs) with formal compiler techniques to enable generalizable and verifiable register allocation across GPU architectures. VeriLocc fine-tunes an LLM to translate machine intermediate representations (MIRs) into target-specific register assignments, aided by static analysis for cross-architecture normalization and generalization, and by a verifier-guided regeneration loop that ensures correctness. Evaluated on matrix multiplication (GEMM) and multi-head attention (MHA), VeriLocc achieves 85–99% single-shot accuracy and near-100% pass@100. A case study shows that VeriLocc discovers more performant assignments than expert-tuned libraries, outperforming rocBLAS by over 10% in runtime.
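The verifier-guided regeneration loop described above can be illustrated with a minimal sketch. This is not VeriLocc's actual implementation: the functions `generate_allocation` (a stand-in for the fine-tuned LLM) and `verify` (a toy interference check) are hypothetical, and the loop simply resamples until the verifier accepts, mirroring the pass@100 evaluation.

```python
# Hypothetical sketch of a verifier-guided regeneration loop.
# All names and data structures here are illustrative assumptions,
# not VeriLocc's real API.
import random

def generate_allocation(mir, seed):
    # Stand-in for the fine-tuned LLM: assign each virtual register
    # in the MIR a physical register (randomly, for illustration).
    rng = random.Random(seed)
    return {v: rng.randrange(8) for v in mir["virtual_regs"]}

def verify(mir, allocation):
    # Toy verifier: two virtual registers that interfere (are live
    # at the same time) must not share a physical register.
    for a, b in mir["interferences"]:
        if allocation[a] == allocation[b]:
            return False
    return True

def allocate(mir, max_attempts=100):
    # Regenerate until the verifier accepts, up to max_attempts samples.
    for seed in range(max_attempts):
        allocation = generate_allocation(mir, seed)
        if verify(mir, allocation):
            return allocation
    return None  # no verified allocation found within the budget

# Tiny example MIR: three virtual registers, two interference edges.
mir = {"virtual_regs": ["v0", "v1", "v2"],
       "interferences": [("v0", "v1"), ("v1", "v2")]}
result = allocate(mir)
```

In the paper's setting the generator is an LLM sampled at nonzero temperature, so each regeneration attempt produces a different candidate; the verifier filters out incorrect ones, which is how single-shot accuracy of 85–99% can rise to near-100% at pass@100.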
Anthology ID:
2025.emnlp-main.1538
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
30240–30250
URL:
https://aclanthology.org/2025.emnlp-main.1538/
Cite (ACL):
Lesheng Jin, Zhenyuan Ruan, Haohui Mai, and Jingbo Shang. 2025. VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 30240–30250, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM (Jin et al., EMNLP 2025)
PDF:
https://aclanthology.org/2025.emnlp-main.1538.pdf
Checklist:
2025.emnlp-main.1538.checklist.pdf