Synthetic Question Value Estimation for Domain Adaptation of Question Answering

Xiang Yue, Ziyu Yao, Huan Sun


Abstract
Synthesizing QA pairs with a question generator (QG) on the target domain has become a popular approach for domain adaptation of question answering (QA) models. Since synthetic questions are often noisy in practice, existing work adapts scores from a pretrained QA (or QG) model as criteria to select high-quality questions. However, these scores do not directly serve the ultimate goal of improving QA performance on the target domain. In this paper, we introduce a novel idea of training a question value estimator (QVE) that directly estimates the usefulness of synthetic questions for improving the target-domain QA performance. By conducting comprehensive experiments, we show that the synthetic questions selected by QVE can help achieve better target-domain QA performance, in comparison with existing techniques. We additionally show that by using such questions and only around 15% of the human annotations on the target domain, we can achieve comparable performance to the fully-supervised baselines.
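The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SyntheticQA` container, the `select_by_value` helper, and the `toy_value_fn` scorer are hypothetical names introduced here, and the toy scorer is only a placeholder for a trained question value estimator. The sketch assumes that each synthetic question receives a scalar value score and that the highest-scoring questions are kept for training the target-domain QA model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SyntheticQA:
    """One synthetic example produced by a question generator (hypothetical container)."""
    context: str
    question: str
    answer: str


def select_by_value(
    examples: List[SyntheticQA],
    value_fn: Callable[[SyntheticQA], float],
    top_k: int,
) -> List[SyntheticQA]:
    """Return the top_k examples ranked by their estimated value, highest first."""
    ranked = sorted(examples, key=value_fn, reverse=True)
    return ranked[:top_k]


def toy_value_fn(example: SyntheticQA) -> float:
    """Placeholder scorer, NOT a trained QVE.

    Assumption for illustration only: an example is worth 1.0 when its
    answer string occurs in the context, and 0.0 otherwise.
    """
    return 1.0 if example.answer in example.context else 0.0


if __name__ == "__main__":
    pool = [
        SyntheticQA("Dublin is in Ireland.", "Where is Dublin?", "France"),
        SyntheticQA("Dublin is in Ireland.", "Where is Dublin?", "Ireland"),
    ]
    for kept in select_by_value(pool, toy_value_fn, top_k=1):
        print(kept.question, "->", kept.answer)
```

In practice the scorer would be a learned model rather than a string-match heuristic; the ranking-and-truncation step is the only part this sketch is meant to convey.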
Anthology ID:
2022.acl-long.95
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1340–1351
URL:
https://aclanthology.org/2022.acl-long.95
DOI:
10.18653/v1/2022.acl-long.95
Cite (ACL):
Xiang Yue, Ziyu Yao, and Huan Sun. 2022. Synthetic Question Value Estimation for Domain Adaptation of Question Answering. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1340–1351, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Synthetic Question Value Estimation for Domain Adaptation of Question Answering (Yue et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.95.pdf
Code:
xiangyue9607/qve
Data:
HotpotQA, Natural Questions, NewsQA, TriviaQA