Chujie Zheng


2022

On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark
Hao Sun | Guangxuan Xu | Jiawen Deng | Jiale Cheng | Chujie Zheng | Hao Zhou | Nanyun Peng | Xiaoyan Zhu | Minlie Huang
Findings of the Association for Computational Linguistics: ACL 2022

Dialogue safety problems severely limit the real-world deployment of neural conversational models and have recently attracted great research interest. However, dialogue safety problems remain under-defined and corresponding datasets are scarce. We propose a taxonomy for dialogue safety specifically designed to capture unsafe behaviors in human-bot dialogue settings, with a focus on context-sensitive unsafety, which is under-explored in prior work. To spur research in this direction, we compile DiaSafety, a dataset with rich context-sensitive unsafe examples. Experiments show that existing safety guarding tools fail severely on our dataset. As a remedy, we train a dialogue safety classifier to provide a strong baseline for context-sensitive dialogue unsafety detection. With our classifier, we perform safety evaluations on popular conversational models and show that existing dialogue systems still exhibit concerning context-sensitive safety problems.

2021

Towards Emotional Support Dialog Systems
Siyang Liu | Chujie Zheng | Orianna Demasi | Sahand Sabour | Yu Li | Zhou Yu | Yong Jiang | Minlie Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Emotional support is a crucial ability for many conversation scenarios, including social interactions, mental health support, and customer service chats. Following reasonable procedures and using various support skills can help to effectively provide support. However, due to the lack of a well-designed task and corpora of effective emotional support conversations, research on building emotional support into dialog systems remains lacking. In this paper, we define the Emotional Support Conversation (ESC) task and propose an ESC Framework, which is grounded on the Helping Skills Theory. We construct an Emotion Support Conversation dataset (ESConv) with rich annotation (especially support strategy) in a help-seeker and supporter mode. To ensure a corpus of high-quality conversations that provide examples of effective emotional support, we take extensive effort to design training tutorials for supporters and several mechanisms for quality control during data collection. Finally, we evaluate state-of-the-art dialog models with respect to the ability to provide emotional support. Our results show the importance of support strategies in providing effective emotional support and the utility of ESConv in training more emotional support systems.

CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation
Chujie Zheng | Yong Liu | Wei Chen | Yongcai Leng | Minlie Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

PsyQA: A Chinese Dataset for Generating Long Counseling Text for Mental Health Support
Hao Sun | Zhenru Lin | Chujie Zheng | Siyang Liu | Minlie Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
Hao Zhou | Chujie Zheng | Kaili Huang | Minlie Huang | Xiaoyan Zhu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Research on knowledge-driven conversational systems is largely limited by the lack of dialog data that consists of multi-turn conversations on multiple topics and with knowledge annotations. In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. Our corpus contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics. To facilitate further research on this corpus, we provide several benchmark models. Comparative results show that the models can be enhanced by introducing background knowledge, yet there is still substantial room for leveraging knowledge to model multi-turn conversations in future research. Results also show obvious performance differences between domains, indicating that transfer learning and domain adaptation are worth exploring further. The corpus and benchmark models are publicly available.

Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation
Chujie Zheng | Yunbo Cao | Daxin Jiang | Minlie Huang
Findings of the Association for Computational Linguistics: EMNLP 2020

In a multi-turn knowledge-grounded dialog, the difference between the knowledge selected at different turns usually provides potential clues to knowledge selection, which has been largely neglected in previous research. In this paper, we propose a difference-aware knowledge selection method. It first computes the difference between the candidate knowledge sentences provided at the current turn and those chosen in the previous turns. Then, the differential information is fused with or disentangled from the contextual information to facilitate final knowledge selection. Automatic, human observational, and interactive evaluations show that our method is able to select knowledge more accurately and generate more informative responses, significantly outperforming the state-of-the-art baselines.

2019

ChID: A Large-scale Chinese IDiom Dataset for Cloze Test
Chujie Zheng | Minlie Huang | Aixin Sun
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Cloze-style reading comprehension in Chinese is still limited by the lack of diverse corpora. In this paper, we propose a large-scale Chinese cloze test dataset, ChID, which studies the comprehension of idioms, a unique language phenomenon in Chinese. In this corpus, the idioms in a passage are replaced by blank symbols and the correct answer needs to be chosen from well-designed candidate idioms. We carefully study how the design of candidate idioms and the representation of idioms affect the performance of state-of-the-art models. Results show that machine accuracy is substantially worse than human accuracy, indicating substantial room for further research.