%0 Conference Proceedings
%T Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
%A Xu, Jun
%A Wang, Haifeng
%A Niu, Zheng-Yu
%A Wu, Hua
%A Che, Wanxiang
%A Liu, Ting
%Y Jurafsky, Dan
%Y Chai, Joyce
%Y Schluter, Natalie
%Y Tetreault, Joel
%S Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
%D 2020
%8 July
%I Association for Computational Linguistics
%C Online
%F xu-etal-2020-conversational
%X To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which there are vertices to represent “what to say” and “how to say”, and edges to represent natural transition between a message (the last utterance in a dialog context) and its response. We then present a novel CG grounded policy learning framework that conducts dialog flow planning by graph traversal, which learns to identify a what-vertex and a how-vertex from the CG at each turn to guide response generation. In this way, we effectively leverage the CG to facilitate policy learning as follows: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.
%R 10.18653/v1/2020.acl-main.166
%U https://aclanthology.org/2020.acl-main.166
%U https://doi.org/10.18653/v1/2020.acl-main.166
%P 1835-1845