Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models

Qingyang Wu, Yichi Zhang, Yu Li, Zhou Yu


Abstract
Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Recurrent Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of large pre-trained language models. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating human-like responses to persuade people to donate to a charity.
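The abstract's core idea, modeling each speaker with its own pre-trained language model while both condition on the shared dialog history, can be illustrated with a minimal sketch. This is an assumption-laden illustration using Hugging Face GPT-2 (model names, sampling settings, and the generate_turn helper are hypothetical), not the authors' implementation.

```python
# Minimal sketch of the alternating-speaker idea: two GPT-2 models,
# one per speaker, each generating only its own turns while conditioning
# on the full dialog history. Illustrative only, not the paper's code.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
user_model = GPT2LMHeadModel.from_pretrained("gpt2")    # models user turns
system_model = GPT2LMHeadModel.from_pretrained("gpt2")  # models system turns

def generate_turn(model, history_text, max_new_tokens=40):
    """Generate the next turn from a speaker-specific model, given the history."""
    input_ids = tokenizer.encode(history_text, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_length=input_ids.shape[1] + max_new_tokens,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Return only the newly generated tokens (the speaker's turn).
    return tokenizer.decode(output_ids[0, input_ids.shape[1]:], skip_special_tokens=True)

# Alternate between the two speaker models on a shared history.
history = "User: I am looking for a cheap restaurant in the centre.\nSystem:"
system_turn = generate_turn(system_model, history)
history += " " + system_turn + "\nUser:"
user_turn = generate_turn(user_model, history)
print(history + " " + user_turn)
```

In this sketch neither model needs belief-state or dialog-act labels; each is simply fine-tuned (or prompted) to produce its own speaker's turns, which reflects the annotation-free framing described in the abstract.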
Anthology ID:
2021.eacl-main.110
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1292–1301
Language:
URL:
https://aclanthology.org/2021.eacl-main.110
DOI:
10.18653/v1/2021.eacl-main.110
Bibkey:
Cite (ACL):
Qingyang Wu, Yichi Zhang, Yu Li, and Zhou Yu. 2021. Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1292–1301, Online. Association for Computational Linguistics.
Cite (Informal):
Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models (Wu et al., EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-main.110.pdf
Code
budzianowski/multiwoz
Data
MultiWOZ