Toward Self-Learning End-to-End Task-oriented Dialog Systems

Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng


Abstract
End-to-end task bots are typically learned over a static and usually limited-size corpus. However, when deployed in dynamic, open environments to interact with users, task bots tend to fail when confronted with data that deviate from the training corpus, i.e., out-of-distribution samples. In this paper, we study the problem of automatically adapting task bots to changing environments by learning from human-bot interactions with minimal or zero human annotations. We propose SL-Agent, a novel self-learning framework for building end-to-end task bots. SL-Agent consists of a dialog model and a pre-trained reward model that predicts the quality of an agent response. It enables task bots to automatically adapt to changing environments by learning, via reinforcement learning with the incorporated reward model, from the unlabeled human-bot dialog logs accumulated after deployment. Experimental results on four well-studied dialog tasks, using both automatic and human evaluations, show the effectiveness of SL-Agent in automatically adapting to changing environments. We will release code and data for further research.
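The self-learning loop described in the abstract (a dialog model updated by reinforcement learning, with a reward model scoring its responses on unlabeled logs) can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyDialogModel`, the keyword-based `reward_model`, the candidate responses, and the plain REINFORCE update without a baseline are all assumptions chosen to keep the example small and runnable.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward_model(context, response):
    """Hypothetical stand-in for a pre-trained reward model.

    Scores a response 1.0 if it mentions the requested domain, else 0.0.
    A real reward model would be a learned predictor of response quality.
    """
    return 1.0 if context in response else 0.0


class ToyDialogModel:
    """Toy policy: one softmax over fixed candidate responses per context."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.logits = {}  # context -> list of logits, one per candidate

    def probs(self, context):
        logits = self.logits.setdefault(context, [0.0] * len(self.candidates))
        return softmax(logits)

    def sample(self, context, rng):
        p = self.probs(context)
        return rng.choices(range(len(p)), weights=p, k=1)[0]

    def reinforce_update(self, context, action, reward, lr):
        # Gradient of log pi(action) w.r.t. logit i is 1[i == action] - p[i].
        p = self.probs(context)
        logits = self.logits[context]
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - p[i]
            logits[i] += lr * reward * grad


def self_learn(model, logs, epochs=200, lr=0.5, seed=0):
    """Adapt the model on unlabeled logs using only reward-model scores."""
    rng = random.Random(seed)
    for _ in range(epochs):
        for context in logs:
            action = model.sample(context, rng)
            reward = reward_model(context, model.candidates[action])
            model.reinforce_update(context, action, reward, lr)
    return model


if __name__ == "__main__":
    candidates = [
        "Here are restaurant options.",
        "Here are hotel options.",
        "Sorry, I cannot help.",
    ]
    # Unlabeled logs: only the user-side request is observed (hypothetical data).
    logs = ["restaurant", "hotel"]
    model = self_learn(ToyDialogModel(candidates), logs)
    for context in logs:
        print(context, [round(p, 3) for p in model.probs(context)])
```

In this sketch no gold responses are ever used: the only learning signal is the score returned by the reward model, which mirrors the "minimal or zero human annotations" setting at a very small scale.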
Anthology ID:
2022.sigdial-1.49
Volume:
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2022
Address:
Edinburgh, UK
Editors:
Oliver Lemon, Dilek Hakkani-Tur, Junyi Jessy Li, Arash Ashrafzadeh, Daniel Hernández Garcia, Malihe Alikhani, David Vandyke, Ondřej Dušek
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
516–530
URL:
https://aclanthology.org/2022.sigdial-1.49
DOI:
10.18653/v1/2022.sigdial-1.49
Cite (ACL):
Xiaoying Zhang, Baolin Peng, Jianfeng Gao, and Helen Meng. 2022. Toward Self-Learning End-to-End Task-oriented Dialog Systems. In Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 516–530, Edinburgh, UK. Association for Computational Linguistics.
Cite (Informal):
Toward Self-Learning End-to-End Task-oriented Dialog Systems (Zhang et al., SIGDIAL 2022)
PDF:
https://aclanthology.org/2022.sigdial-1.49.pdf
Video:
https://youtu.be/FBO3PMW57gU