Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering

Dingmin Wang, Qiuyuan Huang, Matthew Jackson, Jianfeng Gao


Abstract
An open-domain question answering (QA) system usually follows a retrieve-then-read paradigm, in which a retriever is used to retrieve relevant passages from a large corpus, and then a reader generates answers based on the retrieved passages and the original question. In this paper, we propose a simple and novel mutual learning framework to improve the performance of retrieve-then-read-style models via an intermediate module named the knowledge selector, which we train with reinforcement learning. The key benefits of our proposed intermediate module are: 1) no requirement for additional annotated question-passage pairs; 2) improvements in both retrieval and QA performance, as well as computational efficiency, compared to prior competitive retrieve-then-read models; 3) with no finetuning, improvement in the zero-shot performance of large-scale pre-trained language models, e.g., ChatGPT, by encapsulating the input with relevant knowledge without violating the input length constraint.
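The retrieve-then-read flow with an intermediate selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's knowledge selector is trained with reinforcement learning, whereas the toy word-overlap scorer below is only a stand-in, and every name here (`overlap_score`, `retrieve`, `select`, `build_reader_input`, `budget_words`) is hypothetical. The sketch shows only the shape of the pipeline: retrieve candidate passages, keep those that fit an input-length budget, then assemble the reader input.

```python
from typing import List


def tokenize(text: str) -> set:
    """Lowercase whitespace tokenization (toy stand-in for a real tokenizer)."""
    return set(text.lower().split())


def overlap_score(question: str, passage: str) -> float:
    """Fraction of question tokens that also appear in the passage."""
    q_tokens = tokenize(question)
    if not q_tokens:
        return 0.0
    return len(q_tokens & tokenize(passage)) / len(q_tokens)


def retrieve(question: str, corpus: List[str], k: int) -> List[str]:
    """Retriever: return the k highest-scoring passages from the corpus."""
    ranked = sorted(corpus, key=lambda p: overlap_score(question, p), reverse=True)
    return ranked[:k]


def select(question: str, passages: List[str], budget_words: int) -> List[str]:
    """Selector stand-in: greedily keep passages that fit a word budget.

    Assumption: a fixed heuristic replaces the learned selector here.
    """
    selected: List[str] = []
    used = 0
    ranked = sorted(passages, key=lambda p: overlap_score(question, p), reverse=True)
    for passage in ranked:
        n_words = len(passage.split())
        if used + n_words <= budget_words:
            selected.append(passage)
            used += n_words
    return selected


def build_reader_input(question: str, passages: List[str]) -> str:
    """Reader input: selected passages followed by the question."""
    context = "\n".join(passages)
    return f"{context}\nQuestion: {question}"
```

In this sketch the word budget plays the role of the input length constraint mentioned in the abstract: passages that would exceed it are skipped, so the assembled reader input stays within the limit.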
Anthology ID:
2024.tacl-1.14
Volume:
Transactions of the Association for Computational Linguistics, Volume 12
Year:
2024
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
247–263
URL:
https://aclanthology.org/2024.tacl-1.14
DOI:
10.1162/tacl_a_00646
Cite (ACL):
Dingmin Wang, Qiuyuan Huang, Matthew Jackson, and Jianfeng Gao. 2024. Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering. Transactions of the Association for Computational Linguistics, 12:247–263.
Cite (Informal):
Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering (Wang et al., TACL 2024)
PDF:
https://aclanthology.org/2024.tacl-1.14.pdf