DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning

Raymond Li; Yuxi Feng; Zhenan Fan; Giuseppe Carenini; Weiwei Zhang; Mohammadreza Pourreza; Yong Zhang

Abstract

While in-context learning (ICL) has proven to be an effective technique for improving the performance of Large Language Models (LLMs) on a variety of complex tasks, notably translating natural language questions into Structured Query Language (NL2SQL), how to select the most beneficial demonstration examples remains an open research problem. Prior works often adapted off-the-shelf encoders to retrieve examples dynamically, but an inherent discrepancy exists between the representational capacities of the external retrievers and those of the LLMs. Further, optimizing the selection of examples is a non-trivial task, since there are no straightforward methods to assess the relative benefits of examples without performing pairwise inference. To address these shortcomings, we propose DeTriever, a novel demonstration retrieval framework that learns a weighted combination of LLM hidden states, where rich semantic information is encoded. To train the model, we propose a proxy score that estimates the relative benefits of examples based on the similarities between output queries. Experiments on two popular NL2SQL benchmarks demonstrate that our method significantly outperforms the state-of-the-art baselines on NL2SQL tasks.
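The retrieval idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each question is already represented by one vector per LLM layer, that the learned combination is a softmax-weighted sum over layers, and that candidates are ranked by cosine similarity; the function names (`combine_layers`, `retrieve`), the toy vectors, and the fixed layer logits are all hypothetical. The training step that fits the weights to the proxy score is not shown.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def combine_layers(layer_states, layer_logits):
    """Collapse per-layer vectors into one embedding.

    layer_states: list of vectors, one per layer (assumed precomputed).
    layer_logits: one scalar per layer; in a trained model these would be learned.
    """
    weights = softmax(layer_logits)
    dim = len(layer_states[0])
    return [
        sum(w * layer[i] for w, layer in zip(weights, layer_states))
        for i in range(dim)
    ]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_states, pool, layer_logits, k):
    """Return the names of the k pool examples closest to the query."""
    q = combine_layers(query_states, layer_logits)
    scored = [
        (cosine(q, combine_layers(states, layer_logits)), name)
        for name, states in pool.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy data: two layers, two dimensions (purely illustrative values).
pool = {
    "example_a": [[1.0, 0.0], [1.0, 0.0]],
    "example_b": [[1.0, 1.0], [1.0, 1.0]],
}
query = [[1.0, 0.0], [0.0, 1.0]]
top = retrieve(query, pool, layer_logits=[0.0, 0.0], k=1)
```

With equal layer logits the query embedding is the plain average of its two layer vectors, so `example_b` is ranked first in this toy setup.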

Anthology ID: 2025.coling-main.544
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 8173–8183
URL: https://aclanthology.org/2025.coling-main.544/
Cite (ACL): Raymond Li, Yuxi Feng, Zhenan Fan, Giuseppe Carenini, Weiwei Zhang, Mohammadreza Pourreza, and Yong Zhang. 2025. DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning. In Proceedings of the 31st International Conference on Computational Linguistics, pages 8173–8183, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning (Li et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.544.pdf
