Generative Multi-hop Retrieval

Hyunji Lee; Sohee Yang; Hanseok Oh; Minjoon Seo

doi:10.18653/v1/2022.emnlp-main.92

Abstract

A common practice for text retrieval is to use an encoder to map the documents and the query to a common vector space and perform a nearest neighbor search (NNS); multi-hop retrieval often adopts the same paradigm, usually with the modification of iteratively reformulating the query vector so that it can retrieve different documents at each hop. However, such a bi-encoder approach has limitations in multi-hop settings: (1) the reformulated query gets longer as the number of hops increases, which further tightens the embedding bottleneck of the query vector, and (2) it is prone to error propagation. In this paper, we focus on alleviating these limitations in multi-hop settings by formulating the problem in a fully generative way. We propose an encoder-decoder model that performs multi-hop retrieval by simply generating the entire text sequences of the retrieval targets, which means the query and the documents interact in the language model’s parametric space rather than in L2 or inner product space as in the bi-encoder approach. Our approach, Generative Multi-hop Retrieval (GMR), consistently achieves comparable or higher performance than bi-encoder models on five datasets while requiring a smaller GPU memory and storage footprint.
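The hop-by-hop generation described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: decoding is restricted to a prefix tree of corpus passages so that the output is always an existing passage, a word-overlap function (`overlap_score`) stands in for the encoder-decoder language model, and the function names and the three-passage corpus are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # marks the end of a passage inside the prefix tree


def build_trie(passages: Sequence[str]) -> Dict[str, dict]:
    """Build a token-level prefix tree over whitespace-tokenized passages."""
    root: Dict[str, dict] = {}
    for passage in passages:
        node = root
        for token in passage.split() + [END]:
            node = node.setdefault(token, {})
    return root


def generate_passage(
    context: str,
    trie: Dict[str, dict],
    score: Callable[[str, List[str], str], float],
) -> str:
    """Greedy decoding that may only follow token paths present in the trie."""
    node, output = trie, []
    while True:
        # Sorting the candidates makes tie-breaking deterministic.
        token = max(sorted(node), key=lambda t: score(context, output, t))
        if token == END:
            return " ".join(output)
        output.append(token)
        node = node[token]


def multi_hop_retrieve(
    query: str,
    passages: Sequence[str],
    score: Callable[[str, List[str], str], float],
    hops: int = 2,
) -> List[str]:
    """Generate one passage per hop; each hop sees the query plus earlier passages."""
    retrieved: List[str] = []
    context = query
    for _ in range(hops):
        remaining = [p for p in passages if p not in retrieved]
        passage = generate_passage(context, build_trie(remaining), score)
        retrieved.append(passage)
        context = context + " " + passage
    return retrieved


def overlap_score(context: str, prefix: List[str], token: str) -> float:
    """Toy stand-in for a language model: favor tokens that occur in the context."""
    return 1.0 if token.lower() in context.lower().split() else 0.0


corpus = [
    "Paris is the capital of France",
    "Marie Curie moved to Paris",
    "Berlin is the capital of Germany",
]
query = "Which country is the city Marie Curie moved to the capital of?"
print(multi_hop_retrieve(query, corpus, overlap_score))
# ['Marie Curie moved to Paris', 'Paris is the capital of France']
```

In this sketch the second hop succeeds only because the first generated passage is appended to the context, which mirrors the iterative setting the abstract describes; a trained sequence-to-sequence model would replace `overlap_score`.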

Anthology ID: 2022.emnlp-main.92
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1417–1436
URL: https://aclanthology.org/2022.emnlp-main.92/
DOI: 10.18653/v1/2022.emnlp-main.92
Cite (ACL): Hyunji Lee, Sohee Yang, Hanseok Oh, and Minjoon Seo. 2022. Generative Multi-hop Retrieval. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 1417–1436, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Generative Multi-hop Retrieval (Lee et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-main.92.pdf
Video: https://aclanthology.org/2022.emnlp-main.92.mp4
