Dialog Action-Aware Transformer for Dialog Policy Learning

Huimin Wang, Wai Chung Kwan, Kam-Fai Wong


Abstract
Recent works usually address dialog policy learning (DPL) by training a reinforcement learning (RL) agent to determine the best dialog action. However, existing deep RL approaches require a large volume of agent-user interactions to achieve acceptable performance. In this paper, we propose to make full use of plain-text knowledge from a pre-trained language model to accelerate the RL agent's learning. Specifically, we design a dialog action-aware transformer encoder (DaTrans), which integrates a new fine-tuning procedure, the masked last action task, to encourage DaTrans to be dialog-aware and to distill action-specific features. DaTrans is then further optimized in an RL setting with ongoing interactions, evolving through exploration of the dialog action space to maximize long-term accumulated reward. The effectiveness and efficiency of the proposed model are demonstrated in both simulator and human evaluations.
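To make the masked last action task concrete, the sketch below shows one plausible reading of the objective: the input slot encoding the previous dialog action is replaced with a mask token, and the model is trained to recover it, analogous to masked language modeling. All names here (ACTIONS, MASK_ID, the token layout) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical illustration of a "masked last action" objective:
# the token holding the previous dialog action is masked out, and
# the encoder must predict it (cf. masked language modeling).
ACTIONS = ["greet", "request_area", "inform_food", "book_table"]
MASK_ID = -1  # stand-in mask token id

def mask_last_action(turn_tokens, last_action_id):
    """Replace the last-action slot with the mask token.

    Returns the masked input and the prediction target.
    """
    masked = list(turn_tokens)
    masked[-1] = MASK_ID  # assume the final slot holds the previous action
    return masked, last_action_id

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mla_loss(logits, target_id):
    """Cross-entropy loss for recovering the masked action."""
    return -np.log(softmax(logits)[target_id])

# Example turn encoding: [user-intent id, slot id, last-action id]
tokens = [2, 5, 3]  # 3 == index of "book_table" in ACTIONS
masked, target = mask_last_action(tokens, tokens[-1])
logits = np.array([0.1, 0.2, 0.3, 2.0])  # stand-in encoder output
loss = mla_loss(logits, target)
```

After this fine-tuning stage, the same encoder's action logits would be reused as the policy head during RL, so the action-specific features learned here carry over to policy optimization.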
Anthology ID:
2023.sigdial-1.12
Volume:
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2023
Address:
Prague, Czechia
Editors:
Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
142–148
URL:
https://aclanthology.org/2023.sigdial-1.12
DOI:
10.18653/v1/2023.sigdial-1.12
Cite (ACL):
Huimin Wang, Wai Chung Kwan, and Kam-Fai Wong. 2023. Dialog Action-Aware Transformer for Dialog Policy Learning. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 142–148, Prague, Czechia. Association for Computational Linguistics.
Cite (Informal):
Dialog Action-Aware Transformer for Dialog Policy Learning (Wang et al., SIGDIAL 2023)
PDF:
https://aclanthology.org/2023.sigdial-1.12.pdf