GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR

Jiaying Zhang; Lei Shi; Jiguo Li; Jun Xu; Jiuchong Gao; Jinghua Hao; Renqing He

GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR

Jiaying Zhang, Lei Shi, Jiguo Li, Jun Xu, Jiuchong Gao, Jinghua Hao, Renqing He

Abstract

Reinforcement Learning with Verifiable Rewards (RLVR) is a key paradigm for improving large-scale reasoning models. Unlike supervised fine-tuning (SFT), RLVR exhibits distinct optimization dynamics and are sensitive to the preservation of pre-trained geometric structures. However, existing parameter-efficient methods face key limitations in this regime. Low-rank adaptation methods, such as PiSSA, are primarily designed for Supervised Fine-Tuning (SFT) and do not account for the distinct optimization dynamics and geometric structures of RLVR. Conversely, directly fine-tuning the unstructured sparse parameter subspace favored by RLVR encounters efficiency bottlenecks on modern hardware. To address these challenges, we propose GeoRA (Geometry-Aware Low-Rank Adaptation), a low-rank adaptation method tailored for RLVR. Specifically, GeoRA exploits the anisotropic and compressible structure of RL update subspace, and extracts its principal directions via Singular Value Decomposition (SVD) to initialize low-rank adapters, while freezing residual components as a structural anchor during training. This design preserves the pre-trained structure and enables efficient dense computation. Experiments on Qwen and Llama models from 1.5B to 32B parameters show that GeoRA consistently outperforms strong low-rank baselines across RLVR settings in mathematics, medicine, and coding, while showing stronger generalization and less forgetting on out-of-domain tasks.

Anthology ID:: 2026.acl-long.1110
Volume:: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: July
Year:: 2026
Address:: San Diego, California, United States
Editors:: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 24207–24221
Language:
URL:: https://aclanthology.org/2026.acl-long.1110/
DOI:
Bibkey:
Cite (ACL):: Jiaying Zhang, Lei Shi, Jiguo Li, Jun Xu, Jiuchong Gao, Jinghua Hao, and Renqing He. 2026. GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 24207–24221, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):: GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR (Zhang et al., ACL 2026)
Copy Citation:
PDF:: https://aclanthology.org/2026.acl-long.1110.pdf
Checklist:: 2026.acl-long.1110.checklist.pdf

PDF Cite Search Checklist Fix data