A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation

Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao


Abstract
Towards building intelligent dialogue agents, there has been a growing interest in introducing explicit personas in generation models. However, with limited persona-based dialogue data at hand, it may be difficult to train a dialogue generation model well. We point out that the data challenges of this generation task lie in two aspects: first, it is expensive to scale up current persona-based dialogue datasets; second, each data sample in this task is more complex to learn from than conventional dialogue data. To alleviate these data issues, we propose a model-agnostic data manipulation method that can be combined with any persona-based dialogue generation model to improve its performance. The original training samples are first distilled so that they are expected to be easier to fit. Next, we show various effective ways to diversify the easier distilled data. A given base model is then trained via the constructed data curriculum, i.e., first on the augmented distilled samples and then on the original ones. Experiments illustrate the superiority of our method with two strong base dialogue models (Transformer encoder-decoder and GPT2).
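The training order described in the abstract (augmented distilled samples first, original samples afterwards) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `build_curriculum` and `train_with_curriculum`, the epoch parameters, and the toy string samples are all hypothetical, and the distillation and augmentation steps themselves are not shown.

```python
from typing import Callable, Iterable, List, Sequence, TypeVar

Sample = TypeVar("Sample")


def build_curriculum(
    augmented_distilled: Sequence[Sample],
    original: Sequence[Sample],
    easy_epochs: int = 1,
    original_epochs: int = 1,
) -> List[Sequence[Sample]]:
    """Return one dataset per epoch: easier data first, original data last.

    Hypothetical helper; the epoch counts are illustrative assumptions.
    """
    return [augmented_distilled] * easy_epochs + [original] * original_epochs


def train_with_curriculum(
    train_step: Callable[[Sample], None],
    curriculum: Iterable[Sequence[Sample]],
) -> None:
    """Feed each stage to an arbitrary training step, in curriculum order.

    The loop only needs a callable, so it does not depend on the base model.
    """
    for stage in curriculum:
        for sample in stage:
            train_step(sample)


# Toy usage: a list's append method stands in for a real model update,
# so we can inspect the order in which samples are consumed.
seen: List[str] = []
curriculum = build_curriculum(["d1", "d2"], ["o1"], easy_epochs=2, original_epochs=1)
train_with_curriculum(seen.append, curriculum)
```

Because the loop accepts any callable as `train_step`, the same schedule could in principle wrap either base model mentioned in the abstract; how each stage is batched and optimized is left open here.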
Anthology ID:
2022.acl-long.550
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7984–8002
URL:
https://aclanthology.org/2022.acl-long.550
DOI:
10.18653/v1/2022.acl-long.550
Cite (ACL):
Yu Cao, Wei Bi, Meng Fang, Shuming Shi, and Dacheng Tao. 2022. A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7984–8002, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation (Cao et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.550.pdf
Software:
2022.acl-long.550.software.zip
Code:
caoyu-noob/d3