Affordable On-line Dialogue Policy Learning

Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou, Kai Yu


Abstract
The key to building an evolvable dialogue system in real-world scenarios is to ensure affordable on-line dialogue policy learning, which requires the on-line learning process to be safe, efficient and economical. In reality, however, the dialogue system usually grows slowly because real interaction data is scarce. Moreover, a poor initial dialogue policy easily leads to a bad user experience and fails to attract users to contribute training data, so the learning process becomes unsustainable. To depict this accurately, two quantitative metrics are proposed to assess safety and efficiency issues. To solve the unsustainable learning problem, we propose a complete companion teaching framework that incorporates guidance from a human teacher. Since human teaching is expensive, we compare various teaching schemes that answer the question of how and when to teach, so as to use the teaching budget economically and make the on-line learning process affordable.
Anthology ID:
D17-1234
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2200–2209
URL:
https://aclanthology.org/D17-1234/
DOI:
10.18653/v1/D17-1234
Cite (ACL):
Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou, and Kai Yu. 2017. Affordable On-line Dialogue Policy Learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2200–2209, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Affordable On-line Dialogue Policy Learning (Chang et al., EMNLP 2017)
PDF:
https://aclanthology.org/D17-1234.pdf
Attachment:
 D17-1234.Attachment.pdf