Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems

Authors: Pei-Hao Su, David Vandyke, Milica Gašić, Nikola Mrkšić, Tsung-Hsien Wen, Steve Young
Published in: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Editors: Alexander Koller, Gabriel Skantze, Filip Jurcicek, Masahiro Araki, Carolyn Penstein Rose
Publisher: Association for Computational Linguistics
Location: Prague, Czech Republic
Date: 2015-09
Pages: 417-421
Type: conference publication
Anthology ID: su-etal-2015-reward
DOI: 10.18653/v1/W15-4655
URL: https://aclanthology.org/W15-4655/