Improve Dense Passage Retrieval with Entailment Tuning

Lu Dai, Hao Liu, Hui Xiong


Abstract
A retrieval module can be plugged into many downstream NLP tasks to improve their performance, such as open-domain question answering and retrieval-augmented generation. The key to a retrieval system is to calculate relevance scores for query-passage pairs. However, the definition of relevance is often ambiguous. We observe that a major class of relevance aligns with the concept of entailment in NLI tasks. Based on this observation, we design a method called entailment tuning to improve the embeddings of dense retrievers. Specifically, we unify the form of retrieval data and NLI data using existence claims as a bridge. Then, we train retrievers to predict the claims entailed in a passage with a variant of the masked-prediction task. Our method can be efficiently plugged into current dense retrieval methods, and experiments show its effectiveness.
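The abstract's training recipe (unify retrieval and NLI pairs via an existence claim, then mask and predict the claim tokens conditioned on the passage) can be illustrated with a minimal sketch. This is not the authors' released code: the claim template and the helper names `make_existence_claim` and `entailment_tuning_batch` are hypothetical, and a BERT-style encoder with segment embeddings is assumed so the claim segment can be masked wholesale.

```python
# Hedged sketch of entailment tuning as described in the abstract.
# Assumptions (not from the paper): a BERT-style masked-LM backbone,
# a simple declarative claim template, and whole-claim masking.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")


def make_existence_claim(query: str) -> str:
    # Hypothetical template: rewrite a query as a declarative claim,
    # giving retrieval pairs the same premise/hypothesis shape as NLI data.
    return f"there exists an answer to the question: {query}"


def entailment_tuning_batch(passage: str, query: str) -> torch.Tensor:
    claim = make_existence_claim(query)
    enc = tokenizer(passage, claim, return_tensors="pt", truncation=True)
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)  # ignore non-claim positions

    # Mask every token of the claim segment (token_type_ids == 1, except
    # [SEP]) so the model must reconstruct the entailed claim from the
    # passage alone -- a masked-prediction variant over the claim.
    claim_mask = enc["token_type_ids"].bool() & (
        input_ids != tokenizer.sep_token_id
    )
    labels[claim_mask] = input_ids[claim_mask]
    input_ids[claim_mask] = tokenizer.mask_token_id

    out = model(
        input_ids=input_ids,
        attention_mask=enc["attention_mask"],
        token_type_ids=enc["token_type_ids"],
        labels=labels,
    )
    return out.loss


loss = entailment_tuning_batch(
    "The Eiffel Tower was completed in 1889 in Paris.",
    "when was the eiffel tower built",
)
loss.backward()  # an optimizer step would follow in a real training loop
```

After this tuning stage, the encoder would be plugged back into a standard dense-retrieval pipeline (e.g. dual-encoder contrastive training), which is the sense in which the abstract calls the method plug-in efficient.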
Anthology ID:
2024.emnlp-main.636
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11375–11387
URL:
https://aclanthology.org/2024.emnlp-main.636
Cite (ACL):
Lu Dai, Hao Liu, and Hui Xiong. 2024. Improve Dense Passage Retrieval with Entailment Tuning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 11375–11387, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Improve Dense Passage Retrieval with Entailment Tuning (Dai et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.636.pdf