Open-World Factually Consistent Question Generation

Himanshu Maheshwari, Sumit Shekhar, Apoorv Saxena, Niyati Chhaya


Abstract
Question generation methods based on pre-trained language models often suffer from factual inconsistencies: they hallucinate incorrect entities and produce questions that are not answerable from the input paragraph. Domain shift, where the test data comes from a different domain than the training data, further exacerbates this hallucination problem. This is a critical issue for any natural language application that performs question generation. In this work, we propose an effective data processing technique based on de-lexicalization for consistent question generation across domains. Unlike existing approaches for remedying hallucination, the proposed approach does not filter training data and is generic across question-generation models. Experimental results across six benchmark datasets show that our model is robust to domain shift and produces entity-level factually consistent questions without significant impact on traditional metrics.
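The paper's released preprocessing code is not reproduced on this page; as a rough illustration of the general idea, entity-level de-lexicalization is commonly implemented by replacing entity mentions in the input paragraph with typed placeholders before training or generation, and mapping the placeholders in the generated question back to the original entities afterwards. The sketch below is an assumption-laden example, not the authors' implementation: it uses spaCy's en_core_web_sm NER model, and the placeholder format (e.g. [PERSON_0]) is hypothetical.

```python
# Illustrative sketch of entity-level de-lexicalization (not the authors' code).
# Assumes spaCy's en_core_web_sm model; the placeholder naming scheme is hypothetical.
import spacy

nlp = spacy.load("en_core_web_sm")


def delexicalize(paragraph: str):
    """Replace entity mentions with typed placeholders such as [PERSON_0]."""
    doc = nlp(paragraph)
    counts, spans, mapping = {}, [], {}
    for ent in doc.ents:  # forward pass: assign one placeholder per entity mention
        idx = counts.get(ent.label_, 0)
        counts[ent.label_] = idx + 1
        placeholder = f"[{ent.label_}_{idx}]"
        mapping[placeholder] = ent.text
        spans.append((ent.start_char, ent.end_char, placeholder))
    out = paragraph
    # Replace right-to-left so earlier character offsets remain valid.
    for start, end, placeholder in reversed(spans):
        out = out[:start] + placeholder + out[end:]
    return out, mapping


def relexicalize(question: str, mapping: dict) -> str:
    """Map placeholders in a generated question back to the original entity strings."""
    for placeholder, surface in mapping.items():
        question = question.replace(placeholder, surface)
    return question


if __name__ == "__main__":
    text = "Marie Curie won the Nobel Prize in 1903."
    delex, mapping = delexicalize(text)
    print(delex)  # entity mentions replaced by typed placeholders
    print(relexicalize("When did [PERSON_0] win?", mapping))
```

A de-lexicalized paragraph like this would be fed to the question generation model, so the model cannot invent entity names that do not appear in the input; re-lexicalization then restores the original entities in the generated question.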
Anthology ID:
2023.findings-acl.151
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2390–2404
URL:
https://aclanthology.org/2023.findings-acl.151
DOI:
10.18653/v1/2023.findings-acl.151
Cite (ACL):
Himanshu Maheshwari, Sumit Shekhar, Apoorv Saxena, and Niyati Chhaya. 2023. Open-World Factually Consistent Question Generation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2390–2404, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Open-World Factually Consistent Question Generation (Maheshwari et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.151.pdf