GeAR: Generation Augmented Retrieval

Haoyu Liu; Shaohan Huang; Jianfeng Liu; Yuefeng Zhan; Hao Sun; Weiwei Deng; Feng Sun; Furu Wei; Qi Zhang

doi:10.18653/v1/2025.findings-acl.166

Abstract

Document retrieval techniques are essential for developing large-scale information systems. The common approach involves using a bi-encoder to compute the semantic similarity between a query and documents. However, the scalar similarity often fails to reflect enough information, hindering the interpretation of retrieval results. In addition, this process primarily focuses on global semantics, overlooking the finer-grained semantic relationships between the query and the document’s content. In this paper, we introduce a novel method, Generation Augmented Retrieval (GeAR), which not only improves the global document-query similarity through contrastive learning, but also integrates well-designed fusion and decoding modules. This enables GeAR to generate relevant context within the documents based on a given query, facilitating learning to retrieve local fine-grained information. Furthermore, when used as a retriever, GeAR does not incur any additional computational cost over bi-encoders. GeAR exhibits competitive retrieval performance across diverse scenarios and tasks. Moreover, qualitative analysis and the results generated by GeAR provide novel insights into the interpretation of retrieval results. The code, data, and models will be released at https://github.com/microsoft/LMOps.
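As a rough illustration of the bi-encoder baseline and the contrastive objective the abstract mentions, the following sketch scores documents against a query by cosine similarity and computes a generic in-batch contrastive (InfoNCE-style) loss. It is not GeAR's implementation: the fusion and decoding modules are not shown, the vectors are hand-written stand-ins for encoder outputs, and the function names (`cosine`, `rank`, `info_nce`) and the temperature value are assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending similarity to the query."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

def info_nce(query_vecs, doc_vecs, temperature=0.05):
    """In-batch contrastive loss; doc_vecs[i] is the positive for query_vecs[i].

    The temperature is an assumed value, not one taken from the paper.
    """
    total = 0.0
    for i, q in enumerate(query_vecs):
        logits = [cosine(q, d) / temperature for d in doc_vecs]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(query_vecs)

# Toy embeddings standing in for the outputs of a (hypothetical) text encoder.
queries = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1]]
docs = [[0.9, 0.1, 0.3], [0.1, 0.8, 0.0]]

order = rank(queries[0], docs)   # documents ordered for the first query
loss = info_nce(queries, docs)   # low when each query is closest to its own document
```

The single scalar returned by `cosine` is the only signal a plain bi-encoder exposes per query-document pair, which is the interpretability limitation the abstract points to.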

Anthology ID: 2025.findings-acl.166
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3193–3207
URL: https://aclanthology.org/2025.findings-acl.166/
DOI: 10.18653/v1/2025.findings-acl.166
Cite (ACL): Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, and Qi Zhang. 2025. GeAR: Generation Augmented Retrieval. In Findings of the Association for Computational Linguistics: ACL 2025, pages 3193–3207, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): GeAR: Generation Augmented Retrieval (Liu et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.166.pdf
