Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation

Qingqing Zhu, Xiuying Chen, Pengfei Wu, JunFei Liu, Dongyan Zhao


Abstract
Curriculum learning, a machine training strategy that feeds training instances to the model from easy to hard, has been proven to facilitate the dialogue generation task. Meanwhile, knowledge distillation, a knowledge transfer methodology between teacher and student networks, can yield significant performance boosts for student models. Hence, in this paper, we introduce a combination of curriculum learning and knowledge distillation for efficient dialogue generation models, where curriculum learning supports knowledge distillation from both the data and the model aspects. To start with, from the data aspect, we cluster the training cases according to their complexity, which is calculated from various types of features such as sentence length and the coherence between dialogue pairs. Furthermore, we employ an adversarial training strategy to identify the complexity of cases at the model level. The intuition is that, if a discriminator can tell whether a generated response comes from the teacher or the student, then the case is a difficult one that the student model has not yet adapted to. Finally, we use self-paced learning, an extension of curriculum learning, to assign weights for distillation. In conclusion, we arrange a hierarchical curriculum based on the above two aspects for the student model under the guidance of the teacher model. Experimental results demonstrate that our method achieves improvements over competitive baselines.
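
The abstract describes weighting the distillation objective by example difficulty via self-paced learning. The sketch below is a minimal, hypothetical PyTorch illustration of one way such weighting could look: a per-example teacher-student KL loss combined with a precomputed data-level difficulty score, gated by a self-paced threshold that grows during training. The function names, the hard 0/1 weighting scheme, and the way the scores are combined are assumptions for illustration only, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss_per_example(student_logits, teacher_logits, temperature=2.0):
    """Per-example teacher-student KL loss (assumed formulation, not the paper's exact one).

    student_logits, teacher_logits: (batch, seq_len, vocab)
    Returns a (batch,) tensor of per-example distillation losses.
    """
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # Token-level KL, summed over the vocabulary, averaged over tokens.
    kl = F.kl_div(s, t, reduction="none").sum(dim=-1)   # (batch, seq_len)
    return kl.mean(dim=-1) * (temperature ** 2)          # (batch,)

def self_paced_weights(per_example_loss, data_difficulty, threshold):
    """Hard self-paced weighting: admit examples whose combined difficulty is
    below the current threshold; raising the threshold over training gradually
    lets harder cases in.

    data_difficulty: (batch,) precomputed data-level scores
    (e.g. normalized sentence length / coherence features).
    """
    combined = per_example_loss.detach() + data_difficulty
    return (combined <= threshold).float()               # (batch,) 0/1 weights

# Example training step with difficulty-weighted distillation.
batch, seq_len, vocab = 4, 16, 1000
student_logits = torch.randn(batch, seq_len, vocab, requires_grad=True)
teacher_logits = torch.randn(batch, seq_len, vocab)
data_difficulty = torch.rand(batch)                      # placeholder scores

per_ex = distillation_loss_per_example(student_logits, teacher_logits)
w = self_paced_weights(per_ex, data_difficulty, threshold=5.0)
loss = (w * per_ex).sum() / w.sum().clamp(min=1.0)
loss.backward()
```

In an actual curriculum schedule, the threshold would be annealed upward across epochs so that the student first distills from easy cases and later from the harder ones identified by the data-level and model-level (discriminator-based) signals.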
Anthology ID:
2021.findings-emnlp.111
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1284–1295
URL:
https://aclanthology.org/2021.findings-emnlp.111
DOI:
10.18653/v1/2021.findings-emnlp.111
Cite (ACL):
Qingqing Zhu, Xiuying Chen, Pengfei Wu, JunFei Liu, and Dongyan Zhao. 2021. Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1284–1295, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Combining Curriculum Learning and Knowledge Distillation for Dialogue Generation (Zhu et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.111.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.111.mp4
Data
DailyDialog