BibTeX
@inproceedings{zhao-etal-2022-versatile,
title = "A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning",
author = "Zhao, Yang and
Qin, Hua and
Wang, Zhenyu and
Zhu, Changxi and
Wang, Shihan",
editor = "Carpuat, Marine and
de Marneffe, Marie-Catherine and
Meza Ruiz, Ivan Vladimir",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
month = jul,
year = "2022",
address = "Seattle, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-naacl.54",
doi = "10.18653/v1/2022.findings-naacl.54",
pages = "711--723",
abstract = "Training a deep reinforcement learning-based dialogue policy with brute-force random sampling is costly. A new training paradigm was proposed to improve learning performance and efficiency by combining curriculum learning. However, attempts in the field of dialogue policy are very limited due to the lack of reliable evaluation of difficulty scores of dialogue tasks and the high sensitivity to the mode of progression through dialogue tasks. In this paper, we present a novel versatile adaptive curriculum learning (VACL) framework, which presents a substantial step toward applying automatic curriculum learning on dialogue policy tasks. It supports evaluating the difficulty of dialogue tasks only using the learning experiences of dialogue policy and skip-level selection according to their learning needs to maximize the learning efficiency. Moreover, an attractive feature of VACL is the construction of a generic, elastic global curriculum while training a good dialogue policy that could guide different dialogue policy learning without extra effort on re-training. The superiority and versatility of VACL are validated on three public dialogue datasets.",
}
MODS XML
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="zhao-etal-2022-versatile">
<titleInfo>
<title>A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning</title>
</titleInfo>
<name type="personal">
<namePart type="given">Yang</namePart>
<namePart type="family">Zhao</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hua</namePart>
<namePart type="family">Qin</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wang</namePart>
<namePart type="family">Zhenyu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Changxi</namePart>
<namePart type="family">Zhu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shihan</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2022-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Findings of the Association for Computational Linguistics: NAACL 2022</title>
</titleInfo>
<name type="personal">
<namePart type="given">Marine</namePart>
<namePart type="family">Carpuat</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marie-Catherine</namePart>
<namePart type="family">de Marneffe</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ivan</namePart>
<namePart type="given">Vladimir</namePart>
<namePart type="family">Meza Ruiz</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Seattle, United States</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Training a deep reinforcement learning-based dialogue policy with brute-force random sampling is costly. A new training paradigm was proposed to improve learning performance and efficiency by combining curriculum learning. However, attempts in the field of dialogue policy are very limited due to the lack of reliable evaluation of difficulty scores of dialogue tasks and the high sensitivity to the mode of progression through dialogue tasks. In this paper, we present a novel versatile adaptive curriculum learning (VACL) framework, which presents a substantial step toward applying automatic curriculum learning on dialogue policy tasks. It supports evaluating the difficulty of dialogue tasks only using the learning experiences of dialogue policy and skip-level selection according to their learning needs to maximize the learning efficiency. Moreover, an attractive feature of VACL is the construction of a generic, elastic global curriculum while training a good dialogue policy that could guide different dialogue policy learning without extra effort on re-training. The superiority and versatility of VACL are validated on three public dialogue datasets.</abstract>
<identifier type="citekey">zhao-etal-2022-versatile</identifier>
<identifier type="doi">10.18653/v1/2022.findings-naacl.54</identifier>
<location>
<url>https://aclanthology.org/2022.findings-naacl.54</url>
</location>
<part>
<date>2022-07</date>
<extent unit="page">
<start>711</start>
<end>723</end>
</extent>
</part>
</mods>
</modsCollection>
Endnote
%0 Conference Proceedings
%T A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning
%A Zhao, Yang
%A Qin, Hua
%A Wang, Zhenyu
%A Zhu, Changxi
%A Wang, Shihan
%Y Carpuat, Marine
%Y de Marneffe, Marie-Catherine
%Y Meza Ruiz, Ivan Vladimir
%S Findings of the Association for Computational Linguistics: NAACL 2022
%D 2022
%8 July
%I Association for Computational Linguistics
%C Seattle, United States
%F zhao-etal-2022-versatile
%X Training a deep reinforcement learning-based dialogue policy with brute-force random sampling is costly. A new training paradigm was proposed to improve learning performance and efficiency by combining curriculum learning. However, attempts in the field of dialogue policy are very limited due to the lack of reliable evaluation of difficulty scores of dialogue tasks and the high sensitivity to the mode of progression through dialogue tasks. In this paper, we present a novel versatile adaptive curriculum learning (VACL) framework, which presents a substantial step toward applying automatic curriculum learning on dialogue policy tasks. It supports evaluating the difficulty of dialogue tasks only using the learning experiences of dialogue policy and skip-level selection according to their learning needs to maximize the learning efficiency. Moreover, an attractive feature of VACL is the construction of a generic, elastic global curriculum while training a good dialogue policy that could guide different dialogue policy learning without extra effort on re-training. The superiority and versatility of VACL are validated on three public dialogue datasets.
%R 10.18653/v1/2022.findings-naacl.54
%U https://aclanthology.org/2022.findings-naacl.54
%U https://doi.org/10.18653/v1/2022.findings-naacl.54
%P 711-723
Markdown (Informal)
[A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning](https://aclanthology.org/2022.findings-naacl.54) (Zhao et al., Findings 2022)
ACL
Yang Zhao, Hua Qin, Zhenyu Wang, Changxi Zhu, and Shihan Wang. 2022. [A Versatile Adaptive Curriculum Learning Framework for Task-oriented Dialogue Policy Learning](https://aclanthology.org/2022.findings-naacl.54). In *Findings of the Association for Computational Linguistics: NAACL 2022*, pages 711–723, Seattle, United States. Association for Computational Linguistics.