ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking

Xianming Li; Aamir Shakir; Rui Huang; Julius Lipp; Benjamin Clavié; Jing Li

Abstract

Reranking is fundamental to information retrieval and retrieval-augmented generation, and recent Large Language Models (LLMs) have significantly advanced reranking quality. Most current work relies on large-scale LLMs (>7B parameters), which incur high computational costs. Small Language Models (SLMs) offer a promising alternative because of their computational efficiency. However, our preliminary quantitative analysis reveals two key limitations of SLMs: their representation space is narrow, which reduces expressiveness, and they struggle to understand task prompts without fine-tuning. To address these issues, we introduce ProRank, a novel two-stage training approach for SLM-based document reranking. In the first stage, we use reinforcement learning to improve the model's understanding of task prompts. In the second stage, we introduce fine-grained score learning to enhance representation expressiveness and further improve document reranking quality. Extensive experiments suggest that ProRank consistently outperforms the most advanced open-source and proprietary reranking models. Notably, ProRank even surpasses powerful LLM reranking models on the BEIR benchmark, indicating that properly trained SLMs can achieve superior document reranking performance while maintaining computational efficiency.

Anthology ID: 2026.findings-acl.51
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1026–1037
URL: https://aclanthology.org/2026.findings-acl.51/
Cite (ACL): Xianming LI, Aamir Shakir, Rui Huang, Julius Lipp, Benjamin Clavié, and Jing Li. 2026. ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking. In Findings of the Association for Computational Linguistics: ACL 2026, pages 1026–1037, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking (LI et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.51.pdf
Checklist: 2026.findings-acl.51.checklist.pdf
