Analyze, Generate and Refine: Query Expansion with LLMs for Zero-Shot Open-Domain QA

Xinran Chen, Xuanang Chen, Ben He, Tengfei Wen, Le Sun


Abstract
Query expansion (QE) is a critical component of the open-domain question answering (OpenQA) pipeline: it improves retrieval performance by broadening the scope of queries with additional relevant text. However, existing methods such as GAR and EAR rely heavily on supervised training and often struggle to remain effective across domains and datasets. Meanwhile, although large language models (LLMs) have demonstrated QE capability for information retrieval (IR) tasks, their application in OpenQA is hindered by inadequate analysis of the query's informational needs and by the lack of quality control over generated expansions, so they fail to meet the unique requirements of OpenQA. To bridge this gap, we propose AGR, a novel LLM-based QE approach for the OpenQA task that uses a three-step prompting strategy. AGR begins with an analysis of the query, continues with the generation of answer-oriented expansions, and concludes with a refinement step for better query formulation. Extensive experiments on four OpenQA datasets show that AGR not only rivals in-domain supervised methods in retrieval accuracy but also outperforms state-of-the-art baselines in out-of-domain zero-shot scenarios. Moreover, it improves performance in end-to-end QA evaluations, underscoring the superiority of AGR for OpenQA.
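The three-step flow described in the abstract (analyze, generate, refine) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function name `agr_expand`, the `llm` callable (any function that maps a prompt string to a completion string), the prompt wording, and the choice to append the refined expansion to the original query are all hypothetical.

```python
from typing import Callable


def agr_expand(query: str, llm: Callable[[str], str]) -> str:
    """Hypothetical three-step expansion: analyze, generate, refine.

    `llm` is an assumed interface (prompt in, text out); the prompts are
    illustrative placeholders, not the ones used in the paper.
    """
    # Step 1 (analyze): ask the model what information the query needs.
    analysis = llm(
        "Analyze the information need of this question.\n"
        f"Question: {query}"
    )

    # Step 2 (generate): produce an answer-oriented expansion,
    # conditioned on the analysis from step 1.
    draft = llm(
        "Write a short passage that would help answer the question.\n"
        f"Question: {query}\n"
        f"Analysis: {analysis}"
    )

    # Step 3 (refine): filter or rewrite the draft as a quality-control pass.
    refined = llm(
        "Refine the passage, keeping only content relevant to the question.\n"
        f"Question: {query}\n"
        f"Passage: {draft}"
    )

    # Assumption: the expanded query is the original query followed by
    # the refined expansion, ready to be sent to a retriever.
    return f"{query} {refined}".strip()
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client, so the same function can be exercised with a stub during testing.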
Anthology ID:
2024.findings-acl.708
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11908–11922
URL:
https://aclanthology.org/2024.findings-acl.708
Cite (ACL):
Xinran Chen, Xuanang Chen, Ben He, Tengfei Wen, and Le Sun. 2024. Analyze, Generate and Refine: Query Expansion with LLMs for Zero-Shot Open-Domain QA. In Findings of the Association for Computational Linguistics ACL 2024, pages 11908–11922, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Analyze, Generate and Refine: Query Expansion with LLMs for Zero-Shot Open-Domain QA (Chen et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.708.pdf