CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting

Andrea Iovine, Anjie Fang, Besnik Fetahu, Jie Zhao, Oleg Rokhlenko, Shervin Malmasi


Abstract
Users expect search systems to answer their queries regardless of surface form, whether keyword query or natural-language question. Natural Language Understanding (NLU) components of Search and QA systems may fail to correctly interpret semantically equivalent inputs if these deviate from the form the system was trained on, leading to suboptimal understanding capabilities. We propose the keyword-question rewriting task to improve query understanding capabilities of NLU systems for all surface forms. To achieve this, we present CycleKQR, an unsupervised approach that enables effective rewriting between keyword and question queries using non-parallel data. Empirically, we show the impact of unfamiliar query forms on QA performance for open-domain and Knowledge Base QA systems (trained on either keywords or natural language questions). We demonstrate how CycleKQR significantly improves QA performance by rewriting queries into the appropriate form while retaining the original semantic meaning of input queries, allowing CycleKQR to improve performance by up to 3% over supervised baselines. Finally, we release a dataset of 66k keyword-question pairs.
Anthology ID:
2022.emnlp-main.814
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11875–11886
URL:
https://aclanthology.org/2022.emnlp-main.814
DOI:
10.18653/v1/2022.emnlp-main.814
Cite (ACL):
Andrea Iovine, Anjie Fang, Besnik Fetahu, Jie Zhao, Oleg Rokhlenko, and Shervin Malmasi. 2022. CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11875–11886, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting (Iovine et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.814.pdf