Multi-Sentence Knowledge Selection in Open-Domain Dialogue

Mihail Eric, Nicole Chartier, Behnam Hedayatnia, Karthik Gopalakrishnan, Pankaj Rajan, Yang Liu, Dilek Hakkani-Tur


Abstract
Incorporating external knowledge sources effectively in conversations is a longstanding problem in open-domain dialogue research. The existing literature on open-domain knowledge selection is limited and makes certain brittle assumptions about knowledge sources to simplify the overall task, such as the existence of a single relevant knowledge sentence per context. In this work, we evaluate the current state of open-domain conversation knowledge selection, showing where the existing methodologies regarding data and evaluation are flawed. We then improve on them by proposing a new framework for collecting relevant knowledge and by creating an augmented dataset based on the Wizard of Wikipedia (WOW) corpus, which we call WOW++. WOW++ averages 8 relevant knowledge sentences per dialogue context, embracing the inherent ambiguity of open-domain dialogue knowledge selection. We then benchmark various knowledge ranking algorithms on this augmented dataset with both intrinsic evaluation and extrinsic measures of response quality, showing that neural rerankers that use WOW++ can outperform rankers trained on standard datasets.
Anthology ID:
2021.inlg-1.9
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
76–86
URL:
https://aclanthology.org/2021.inlg-1.9
DOI:
10.18653/v1/2021.inlg-1.9
Cite (ACL):
Mihail Eric, Nicole Chartier, Behnam Hedayatnia, Karthik Gopalakrishnan, Pankaj Rajan, Yang Liu, and Dilek Hakkani-Tur. 2021. Multi-Sentence Knowledge Selection in Open-Domain Dialogue. In Proceedings of the 14th International Conference on Natural Language Generation, pages 76–86, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
Multi-Sentence Knowledge Selection in Open-Domain Dialogue (Eric et al., INLG 2021)
PDF:
https://aclanthology.org/2021.inlg-1.9.pdf
Code
 alexa/wow-plus-plus
Data
Wizard of Wikipedia