Yinhe Zheng


pdf bib
MMChat: Multi-Modal Chat Dataset on Social Media
Yinhe Zheng | Guanyi Chen | Xin Liu | Jian Sun
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Incorporating multi-modal contexts in conversation is an important step for developing more engaging dialogue systems. In this work, we explore this direction by introducing MMChat: a large scale Chinese multi-modal dialogue corpus (32.4M raw dialogues and 120.84K filtered dialogues). Unlike previous corpora that are crowd-sourced or collected from fictitious movies, MMChat contains image-grounded dialogues collected from real conversations on social media, in which the sparsity issue is observed. Specifically, image-initiated dialogues in common communications may deviate to some non-image-grounded topics as the conversation proceeds. To better investigate this issue, we manually annotate 100K dialogues from MMChat and further filter the corpus accordingly, which yields MMChat-hf. We develop a benchmark model to address the sparsity issue in dialogue generation tasks by adapting the attention routing mechanism on image features. Experiments demonstrate the usefulness of incorporating image features and the effectiveness in handling the sparsity of image features.

pdf bib
LayerConnect: Hypernetwork-Assisted Inter-Layer Connector to Enhance Parameter Efficiency
Haoxiang Shi | Rongsheng Zhang | Jiaan Wang | Cen Wang | Yinhe Zheng | Tetsuya Sakai
Proceedings of the 29th International Conference on Computational Linguistics

Pre-trained Language Models (PLMs) are the cornerstone of the modern Natural Language Processing (NLP). However, as PLMs become heavier, fine tuning all their parameters loses their efficiency. Existing parameter-efficient methods generally focus on reducing the trainable parameters in PLMs but neglect the inference speed, which limits the ability to deploy PLMs. In this paper, we propose LayerConnect (hypernetwork-assisted inter-layer connectors) to enhance inference efficiency. Specifically, a light-weight connector with a linear structure is inserted between two Transformer layers, and the parameters inside each connector are tuned by a hypernetwork comprising an interpolator and a down-sampler. We perform extensive experiments on the widely used the GLUE benchmark. The experimental results verify the inference efficiency of our model. Compared to Adapter, our model parameters are reduced to approximately 11.75%, while the performance degradation is kept to less than 5% (2.5 points on average).

pdf bib
Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation
Yingxiu Zhao | Zhiliang Tian | Huaxiu Yao | Yinhe Zheng | Dongkyu Lee | Yiping Song | Jian Sun | Nevin Zhang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Building models of natural language processing (NLP) is challenging in low-resource scenarios where limited data are available. Optimization-based meta-learning algorithms achieve promising results in low-resource scenarios by adapting a well-generalized model initialization to handle new tasks. Nonetheless, these approaches suffer from the memorization overfitting issue, where the model tends to memorize the meta-training tasks while ignoring support sets when adapting to new tasks. To address this issue, we propose a memory imitation meta-learning (MemIML) method that enhances the model’s reliance on support sets for task adaptation. Specifically, we introduce a task-specific memory module to store support set information and construct an imitation module to force query sets to imitate the behaviors of support sets stored in the memory. A theoretical analysis is provided to prove the effectiveness of our method, and empirical results also demonstrate that our method outperforms competitive baselines on both text classification and generation tasks.

pdf bib
Rethinking and Refining the Distinct Metric
Siyang Liu | Sahand Sabour | Yinhe Zheng | Pei Ke | Xiaoyan Zhu | Minlie Huang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Distinct is a widely used automatic metric for evaluating diversity in language generation tasks.However, we observed that the original approach to calculating distinct scores has evident biases that tend to assign higher penalties to longer sequences. We refine the calculation of distinct scores by scaling the number of distinct tokens based on their expectations. We provide both empirical and theoretical evidence to show that our method effectively removes the biases existing in the original distinct score. Our experiments show that our proposed metric, Expectation-Adjusted Distinct (EAD), correlates better with human judgment in evaluating response diversity.To assist future research, we provide an example implementation at https://github.com/lsy641/Expectation-Adjusted-Distinct.


pdf bib
Transferable Persona-Grounded Dialogues via Grounded Minimal Edits
Chen Henry Wu | Yinhe Zheng | Xiaoxi Mao | Minlie Huang
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Grounded dialogue models generate responses that are grounded on certain concepts. Limited by the distribution of grounded dialogue data, models trained on such data face the transferability challenges in terms of the data distribution and the type of grounded concepts. To address the challenges, we propose the grounded minimal editing framework, which minimally edits existing responses to be grounded on the given concept. Focusing on personas, we propose Grounded Minimal Editor (GME), which learns to edit by disentangling and recombining persona-related and persona-agnostic parts of the response. To evaluate persona-grounded minimal editing, we present the PersonaMi-nEdit dataset, and experimental results show that GME outperforms competitive baselines by a large margin. To evaluate the transferability, we experiment on the test set of BlendedSkillTalk and show that GME can edit dialogue models’ responses to largely improve their persona consistency while preserving the use of knowledge and empathy.

pdf bib
Diversifying Dialog Generation via Adaptive Label Smoothing
Yida Wang | Yinhe Zheng | Yong Jiang | Minlie Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Neural dialogue generation models trained with the one-hot target distribution suffer from the over-confidence issue, which leads to poor generation diversity as widely reported in the literature. Although existing approaches such as label smoothing can alleviate this issue, they fail to adapt to diverse dialog contexts. In this paper, we propose an Adaptive Label Smoothing (AdaLabel) approach that can adaptively estimate a target label distribution at each time step for different contexts. The maximum probability in the predicted distribution is used to modify the soft target distribution produced by a novel light-weight bi-directional decoder module. The resulting target distribution is aware of both previous and future contexts and is adjusted to avoid over-training the dialogue model. Our model can be trained in an endto-end manner. Extensive experiments on two benchmark datasets show that our approach outperforms various competitive baselines in producing diverse responses.


pdf bib
Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data
Rongsheng Zhang | Yinhe Zheng | Jianzhi Shao | Xiaoxi Mao | Yadong Xi | Minlie Huang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Recent advances in open-domain dialogue systems rely on the success of neural models that are trained on large-scale data. However, collecting large-scale dialogue data is usually time-consuming and labor-intensive. To address this data dilemma, we propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data. Specifically, a data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data. A ranking module is employed to filter out low-quality dialogues. Further, a model-level distillation process is employed to distill a teacher model trained on high-quality paired data to augmented dialogue pairs, thereby preventing dialogue models from being affected by the noise in the augmented data. Automatic and manual evaluation indicates that our method can produce high-quality dialogue pairs with diverse contents, and the proposed data-level and model-level dialogue distillation can improve the performance of competitive baselines.

pdf bib
Listener’s Social Identity Matters in Personalised Response Generation
Guanyi Chen | Yinhe Zheng | Yupei Du
Proceedings of the 13th International Conference on Natural Language Generation

Personalised response generation enables generating human-like responses by means of assigning the generator a social identity. However, pragmatics theory suggests that human beings adjust the way of speaking based on not only who they are but also whom they are talking to. In other words, when modelling personalised dialogues, it might be favourable if we also take the listener’s social identity into consideration. To validate this idea, we use gender as a typical example of a social variable to investigate how the listener’s identity influences the language used in Chinese dialogues on social media. Also, we build personalised generators. The experiment results demonstrate that the listener’s identity indeed matters in the language use of responses and that the response generator can capture such differences in language use. More interestingly, by additionally modelling the listener’s identity, the personalised response generator performs better in its own identity.


pdf bib
Quantifying Context Overlap for Training Word Embeddings
Yimeng Zhuang | Jinghui Xie | Yinhe Zheng | Xuan Zhu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Most models for learning word embeddings are trained based on the context information of words, more precisely first order co-occurrence relations. In this paper, a metric is designed to estimate second order co-occurrence relations based on context overlap. The estimated values are further used as the augmented data to enhance the learning of word embeddings by joint training with existing neural word embedding models. Experimental results show that better word vectors can be obtained for word similarity tasks and some downstream NLP tasks by the enhanced approach.