A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems

Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng


Abstract
Building user simulators (USs) for reinforcement learning (RL) of task-oriented dialog systems (DSs) has gained increasing attention, but it still faces several fundamental challenges. First, it is unclear whether we can leverage pretrained language models to design, for example, GPT-2 based USs that can catch up and interact with the recently advanced GPT-2 based DSs. Second, an important ingredient in a US is that the user goal can be effectively incorporated and tracked; but how to flexibly integrate goal state tracking and develop an end-to-end trainable US for multiple domains remains a challenge. In this work, we propose a generative user simulator (GUS) with a GPT-2 based architecture and goal state tracking to address the above two challenges. Extensive experiments are conducted on MultiWOZ2.1. Different DSs are trained via RL with GUS, the classic agenda-based user simulator (ABUS), and other ablation simulators, respectively, and are compared in cross-model evaluation, corpus-based evaluation, and human evaluation. GUS achieves superior results in all three evaluation tasks.
Anthology ID:
2022.seretod-1.10
Volume:
Proceedings of the Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems (SereTOD)
Month:
December
Year:
2022
Address:
Abu Dhabi, Beijing (Hybrid)
Editors:
Zhijian Ou, Junlan Feng, Juanzi Li
Venue:
SereTOD
Publisher:
Association for Computational Linguistics
Pages:
85–97
URL:
https://aclanthology.org/2022.seretod-1.10
DOI:
10.18653/v1/2022.seretod-1.10
Cite (ACL):
Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, and Junlan Feng. 2022. A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems. In Proceedings of the Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems (SereTOD), pages 85–97, Abu Dhabi, Beijing (Hybrid). Association for Computational Linguistics.
Cite (Informal):
A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems (Liu et al., SereTOD 2022)
PDF:
https://aclanthology.org/2022.seretod-1.10.pdf
Video:
https://aclanthology.org/2022.seretod-1.10.mp4