%0 Conference Proceedings
%T Diversifying Neural Dialogue Generation via Negative Distillation
%A Li, Yiwei
%A Feng, Shaoxiong
%A Sun, Bin
%A Li, Kan
%Y Carpuat, Marine
%Y de Marneffe, Marie-Catherine
%Y Meza Ruiz, Ivan Vladimir
%S Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
%D 2022
%8 July
%I Association for Computational Linguistics
%C Seattle, United States
%F li-etal-2022-diversifying
%X Generative dialogue models suffer badly from the generic response problem, limiting their applications to a few toy scenarios. Recently, an interesting approach, namely negative training, has been proposed to alleviate this problem by reminding the model not to generate high-frequency responses during training. However, its performance is hindered by two issues, ignoring low-frequency but generic responses and bringing low-frequency but meaningless responses. In this paper, we propose a novel negative training paradigm, called negative distillation, to keep the model away from the undesirable generic responses while avoiding the above problems. First, we introduce a negative teacher model that can produce query-wise generic responses, and then the student model is required to maximize the distance with multi-level negative knowledge. Empirical results show that our method outperforms previous negative training methods significantly.
%R 10.18653/v1/2022.naacl-main.31
%U https://aclanthology.org/2022.naacl-main.31
%U https://doi.org/10.18653/v1/2022.naacl-main.31
%P 407-418