A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition

Huimin Wang, Kam-Fai Wong


Abstract
Most reinforcement learning methods for dialog policy learning train a centralized agent that selects a predefined joint action concatenating domain name, intent type, and slot name. Because of the large action space, the centralized dialog agent requires a large number of user-agent interactions. In addition, designing the concatenated actions is laborious for engineers and may struggle with edge cases. To solve these problems, we model the dialog policy learning problem with a novel multi-agent framework, in which each part of the action is led by a different agent. The framework reduces labor costs for action templates and decreases the size of the action space for each agent. Furthermore, we alleviate the non-stationarity problem, which arises because the dynamics of the environment change as the agents’ policies evolve, by introducing a joint optimization process that lets agents exchange their policy information. In addition, an independent experience replay buffer mechanism is integrated to reduce the dependence between gradients of samples and thereby improve training efficiency. The effectiveness of the proposed framework is demonstrated in a multi-domain environment with both user simulator evaluation and human evaluation.
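The action-space argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the domain, intent, and slot vocabularies and the helper names (`joint_action_space`, `decomposed_action_spaces`, `compose`) are hypothetical, and the centralized baseline is assumed here to cover the full Cartesian product, whereas the abstract says only that its joint actions are predefined.

```python
from itertools import product

# Hypothetical vocabularies, for illustration only (not taken from the paper).
DOMAINS = ["hotel", "restaurant", "train"]
INTENTS = ["inform", "request", "book"]
SLOTS = ["area", "price", "time", "name"]


def joint_action_space(domains, intents, slots):
    """Centralized view: one agent picks among all concatenated actions."""
    return [f"{d}-{i}-{s}" for d, i, s in product(domains, intents, slots)]


def decomposed_action_spaces(domains, intents, slots):
    """Multi-agent view: each agent picks one part of the action."""
    return {"domain": list(domains), "intent": list(intents), "slot": list(slots)}


def compose(domain, intent, slot):
    """Combine the three agents' choices into one dialog action string."""
    return f"{domain}-{intent}-{slot}"


joint = joint_action_space(DOMAINS, INTENTS, SLOTS)
per_agent = decomposed_action_spaces(DOMAINS, INTENTS, SLOTS)

# The joint space grows multiplicatively; each agent's space stays small.
joint_size = len(joint)
largest_agent_size = max(len(space) for space in per_agent.values())
print(joint_size, largest_agent_size)
```

Under these assumed vocabularies the single agent chooses among the product of the three vocabulary sizes, while no individual agent chooses among more options than the largest single vocabulary.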
Anthology ID:
2021.emnlp-main.621
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7882–7889
URL:
https://aclanthology.org/2021.emnlp-main.621
DOI:
10.18653/v1/2021.emnlp-main.621
Cite (ACL):
Huimin Wang and Kam-Fai Wong. 2021. A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7882–7889, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition (Wang & Wong, EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.621.pdf
Video:
https://aclanthology.org/2021.emnlp-main.621.mp4
Data:
MultiWOZ