On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems

Pei-Hao Su, Milica Gašić, Nikola Mrkšić, Lina M Rojas-Barahona, Stefan Ultes, David Vandyke, Tsung-Hsien Wen, and Steve Young. 2016. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2431-2441, Berlin, Germany, August 2016. Edited by Katrin Erk and Noah A Smith. Association for Computational Linguistics. Conference publication. Citation key: su-etal-2016-line. DOI: 10.18653/v1/P16-1230. URL: https://aclanthology.org/P16-1230/