Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework

Zhihong Shao, Minlie Huang


Abstract
Open-domain questions are likely to be open-ended and ambiguous, leading to multiple valid answers. Existing approaches typically adopt the rerank-then-read framework, where a reader reads top-ranking evidence to predict answers. According to our empirical analysis, this framework faces three problems: first, to leverage a large reader under a memory constraint, the reranker should select only a few relevant passages to cover diverse answers, while balancing relevance and diversity is non-trivial; second, the small reading budget prevents the reader from accessing valuable retrieved evidence filtered out by the reranker; third, when using a generative reader to predict answers all at once based on all selected evidence, whether a valid answer will be predicted also pathologically depends on evidence of some other valid answer(s). To address these issues, we propose to answer open-domain multi-answer questions with a recall-then-verify framework, which separates the reasoning process of each answer so that we can make better use of retrieved evidence while also leveraging large models under the same memory constraint. Our framework achieves state-of-the-art results on two multi-answer datasets, and predicts significantly more gold answers than a rerank-then-read system that uses an oracle reranker.
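The control flow suggested by the abstract can be sketched roughly as follows: first recall a broad set of candidate answers, then verify each candidate separately against its own evidence, so that accepting one answer does not depend on the evidence for another. This is only an illustrative sketch, not the authors' implementation. The names recall_then_verify, toy_recall, and toy_verify, the threshold, the evidence_budget parameter, and the string-matching heuristics are all assumptions made for this example; in the paper the corresponding components are learned models.

```python
from typing import Callable, List

def recall_then_verify(
    question: str,
    passages: List[str],
    recall: Callable[[str, List[str]], List[str]],
    verify: Callable[[str, str, List[str]], float],
    threshold: float = 0.5,
    evidence_budget: int = 2,
) -> List[str]:
    """Hypothetical sketch: propose candidates, then check each one on its own."""
    candidates = recall(question, passages)
    accepted = []
    # Deduplicate while keeping the order in which candidates were proposed.
    for cand in dict.fromkeys(candidates):
        # Assumption: evidence for a candidate is the passages that mention it,
        # capped by a per-candidate budget rather than one shared budget.
        evidence = [p for p in passages if cand.lower() in p.lower()]
        evidence = evidence[:evidence_budget]
        # Each candidate is scored independently of all other candidates.
        if verify(question, cand, evidence) >= threshold:
            accepted.append(cand)
    return accepted

def toy_recall(question: str, passages: List[str]) -> List[str]:
    # Toy high-recall step (assumption): propose the first word of every passage.
    return [p.split()[0] for p in passages]

def toy_verify(question: str, candidate: str, evidence: List[str]) -> float:
    # Toy verifier (assumption): accept if any evidence passage states the fact.
    return 1.0 if any("hosted the Summer Olympics" in p for p in evidence) else 0.0

question = "Which cities hosted the Summer Olympics?"
passages = [
    "Paris hosted the Summer Olympics in 1900 and 1924.",
    "London hosted the Summer Olympics in 1908, 1948 and 2012.",
    "Geneva is a city in Switzerland.",
]
answers = recall_then_verify(question, passages, toy_recall, toy_verify)
print(answers)
```

Because each candidate is checked in its own call, the memory needed per verification depends on the per-candidate evidence budget rather than on the total number of valid answers.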
Anthology ID:
2022.acl-long.128
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1825–1838
URL:
https://aclanthology.org/2022.acl-long.128
DOI:
10.18653/v1/2022.acl-long.128
Cite (ACL):
Zhihong Shao and Minlie Huang. 2022. Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1825–1838, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Answering Open-Domain Multi-Answer Questions via a Recall-then-Verify Framework (Shao & Huang, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.128.pdf
Code:
zhihongshao/rectify
Data:
Natural Questions