Information Seeking in the Spirit of Learning: A Dataset for Conversational Curiosity

Pedro Rodriguez; Paul A. Crook; Seungwhan Moon; Zhiguang Wang

doi:10.18653/v1/2020.emnlp-main.655

Abstract

Open-ended human learning and information-seeking are increasingly mediated by digital assistants. However, such systems often ignore the user’s pre-existing knowledge. Assuming a correlation between engagement and user responses such as “liking” messages or asking followup questions, we design a Wizard-of-Oz dialog task that tests the hypothesis that engagement increases when users are presented with facts related to what they know. Through crowd-sourcing of this experiment, we collect and release 14K dialogs (181K utterances) where users and assistants converse about geographic topics like geopolitical entities and locations. This dataset is annotated with pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia, and user reactions to messages. Responses using a user’s prior knowledge increase engagement. We incorporate this knowledge into a multi-task model that reproduces human assistant policies and improves over a BERT content model by 13 mean reciprocal rank points.
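
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how mean reciprocal rank (MRR) is commonly computed for a ranking task. It is not taken from the paper or its released code: the function name `mean_reciprocal_rank`, the list-of-candidates input layout, and the toy fact identifiers are all illustrative assumptions. It also assumes that "points" in the abstract refers to MRR scaled by 100, which the abstract does not state.

```python
from typing import Sequence


def mean_reciprocal_rank(
    ranked_lists: Sequence[Sequence[str]],
    gold_items: Sequence[str],
) -> float:
    """Average of 1/rank of the gold item over all queries.

    A query whose gold item is missing from its ranked list contributes 0.
    (Hypothetical helper for illustration; not the paper's implementation.)
    """
    if len(ranked_lists) != len(gold_items):
        raise ValueError("need exactly one gold item per ranked list")
    if not gold_items:
        return 0.0

    total = 0.0
    for candidates, gold in zip(ranked_lists, gold_items):
        # Ranks are 1-based: the top candidate has rank 1.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == gold:
                total += 1.0 / rank
                break
    return total / len(gold_items)


# Toy example with made-up fact identifiers: the gold fact is ranked
# first for one query and second for the other.
rankings = [["fact_a", "fact_b"], ["fact_b", "fact_a"]]
gold = ["fact_a", "fact_a"]
mrr = mean_reciprocal_rank(rankings, gold)
```

Under the scaling assumption above, a 13-point gain would correspond to a difference of 0.13 in the value this function returns.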

Anthology ID: 2020.emnlp-main.655
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 8153–8172
URL: https://aclanthology.org/2020.emnlp-main.655
DOI: 10.18653/v1/2020.emnlp-main.655
Cite (ACL): Pedro Rodriguez, Paul Crook, Seungwhan Moon, and Zhiguang Wang. 2020. Information Seeking in the Spirit of Learning: A Dataset for Conversational Curiosity. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8153–8172, Online. Association for Computational Linguistics.
Cite (Informal): Information Seeking in the Spirit of Learning: A Dataset for Conversational Curiosity (Rodriguez et al., EMNLP 2020)
PDF: https://aclanthology.org/2020.emnlp-main.655.pdf
Video: https://slideslive.com/38938726
Code: facebookresearch/curiosity
Data: Curiosity, CMU DoG, MS MARCO, QuAC, Topical-Chat, Wizard of Wikipedia
