When to Read Documents or QA History: On Unified and Selective Open-domain QA

Kyungjae Lee, Sang-eun Han, Seung-won Hwang, Moontae Lee


Abstract
This paper studies open-domain question answering, which aims to answer a diverse range of questions by leveraging knowledge resources. Two types of sources, QA-pair corpora and document corpora, have been actively used, and they have complementary strengths. The former is highly precise when a paraphrase of the given question q was seen and answered during training, and is often posed as a retrieval problem, while the latter generalizes better to unseen questions. A natural follow-up is thus to leverage both models, yet naive pipelining or integration approaches have failed to bring additional gains over either model alone. Our distinction is to interpret the problem as calibration: we estimate the confidence of predicted answers and use it as an indicator to decide when to use a document or QA-pair corpus. The effectiveness of our method was validated on widely adopted benchmarks such as Natural Questions and TriviaQA.
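As a rough illustration of the confidence-based selection idea in the abstract, the following minimal sketch routes a question to one of two answer sources. It is not the authors' implementation: the `Prediction` type, the `select_answer` function, the fixed `threshold`, and the rule of preferring the QA-pair answer when it is confident are all assumptions made for illustration, and the confidence scores are assumed to be already calibrated.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A candidate answer with an (assumed calibrated) confidence in [0, 1]."""
    answer: str
    confidence: float
    source: str  # hypothetical label: "qa_pair" or "document"


def select_answer(qa_pred: Prediction, doc_pred: Prediction,
                  threshold: float = 0.5) -> Prediction:
    """Pick one prediction using confidence as the selection signal.

    Assumed rule (illustrative only): trust the QA-pair prediction when
    its confidence reaches the threshold, otherwise fall back to the
    document-based prediction.
    """
    if qa_pred.confidence >= threshold:
        return qa_pred
    return doc_pred


# Hypothetical example: a confident QA-pair match is kept, a weak one is not.
confident = select_answer(
    Prediction("Paris", 0.9, "qa_pair"),
    Prediction("Lyon", 0.6, "document"),
)
uncertain = select_answer(
    Prediction("Paris", 0.3, "qa_pair"),
    Prediction("Lyon", 0.6, "document"),
)
print(confident.source, uncertain.source)
```

A single threshold on one model's confidence is only one possible decision rule; comparing the two calibrated confidences directly would be another, and the abstract does not specify which form the paper uses.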
Anthology ID:
2023.findings-acl.401
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6420–6432
URL:
https://aclanthology.org/2023.findings-acl.401
DOI:
10.18653/v1/2023.findings-acl.401
Cite (ACL):
Kyungjae Lee, Sang-eun Han, Seung-won Hwang, and Moontae Lee. 2023. When to Read Documents or QA History: On Unified and Selective Open-domain QA. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6420–6432, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
When to Read Documents or QA History: On Unified and Selective Open-domain QA (Lee et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.401.pdf
Video:
https://aclanthology.org/2023.findings-acl.401.mp4