Zheng-Yu Niu

Also published as: Zheng Yu Niu, Zheng-yu Niu, Zhengyu Niu


2023

pdf bib
XDailyDialog: A Multilingual Parallel Dialogue Corpus
Zeming Liu | Ping Nie | Jie Cai | Haifeng Wang | Zheng-Yu Niu | Peng Zhang | Mrinmaya Sachan | Kaiping Peng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

High-quality datasets are significant to the development of dialogue models. However, most existing datasets for open-domain dialogue modeling are limited to a single language. The absence of multilingual open-domain dialog datasets not only limits the research on multilingual or cross-lingual transfer learning, but also hinders the development of robust open-domain dialog systems that can be deployed in other parts of the world. In this paper, we provide a multilingual parallel open-domain dialog dataset, XDailyDialog, to enable researchers to explore the challenging task of multilingual and cross-lingual open-domain dialog. XDailyDialog includes 13K dialogues aligned across 4 languages (52K dialogues and 410K utterances in total). We then propose a dialog generation model, kNN-Chat, which has a novel kNN-search mechanism to support unified response retrieval for monolingual, multilingual, and cross-lingual dialogue. Experiment results show the effectiveness of this framework. We will make XDailyDialog and kNN-Chat publicly available soon.

pdf bib
Towards Zero-Shot Persona Dialogue Generation with In-Context Learning
Xinchao Xu | Zeyang Lei | Wenquan Wu | Zheng-Yu Niu | Hua Wu | Haifeng Wang
Findings of the Association for Computational Linguistics: ACL 2023

Much work has been done to improve persona consistency by finetuning a pretrained dialogue model on high-quality human-annoated persona datasets. However, these methods still face the challenges of high cost and poor scalability. To this end, we propose a simple-yet-effective approach to significantly improve zero-shot persona consistency via in-context learning. Specifically, we first pre-train a persona-augmented dialogue generation model and then utilize in-context prompting mechanism to realize zero-shot persona customization. Experimental results demonstrate that our method can dramatically improve persona consistency without compromising coherence and informativeness in zero-shot settings.

2022

pdf bib
Where to Go for the Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals
Zeming Liu | Jun Xu | Zeyang Lei | Haifeng Wang | Zheng-Yu Niu | Hua Wu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most dialog systems posit that users have figured out clear and specific goals before starting an interaction. For example, users have determined the departure, the destination, and the travel time for booking a flight. However, in many scenarios, limited by experience and knowledge, users may know what they need, but still struggle to figure out clear and specific goals by determining all the necessary slots. In this paper, we identify this challenge, and make a step forward by collecting a new human-to-human mixed-type dialog corpus. It contains 5k dialog sessions and 168k utterances for 4 dialog types and 5 domains. Within each session, an agent first provides user-goal-related knowledge to help figure out clear and specific goals, and then help achieve them. Furthermore, we propose a mixed-type dialog model with a novel Prompt-based continual learning mechanism. Specifically, the mechanism enables the model to continually strengthen its ability on any specific type by utilizing existing dialog corpora effectively.

pdf bib
CDConv: A Benchmark for Contradiction Detection in Chinese Conversations
Chujie Zheng | Jinfeng Zhou | Yinhe Zheng | Libiao Peng | Zhen Guo | Wenquan Wu | Zheng-Yu Niu | Hua Wu | Minlie Huang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Dialogue contradiction is a critical issue in open-domain dialogue systems. The contextualization nature of conversations makes dialogue contradiction detection rather challenging. In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. It contains 12K multi-turn conversations annotated with three typical contradiction categories: Intra-sentence Contradiction, Role Confusion, and History Contradiction. To efficiently construct the CDConv conversations, we devise a series of methods for automatic conversation generation, which simulate common user behaviors that trigger chatbots to make contradictions. We conduct careful manual quality screening of the constructed conversations and show that state-of-the-art Chinese chatbots can be easily goaded into making contradictions. Experiments on CDConv show that properly modeling contextual information is critical for dialogue contradiction detection, but there are still unresolved challenges that require future research.

pdf bib
PLATO-Ad: A Unified Advertisement Text Generation Framework with Multi-Task Prompt Learning
Zeyang Lei | Chao Zhang | Xinchao Xu | Wenquan Wu | Zheng-yu Niu | Hua Wu | Haifeng Wang | Yi Yang | Shuanglong Li
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Online advertisement text generation aims at generating attractive and persuasive text ads to appeal to users clicking ads or purchasing products. While pretraining-based models have achieved remarkable success in generating high-quality text ads, some challenges still remain, such as ad generation in low-resource scenarios and training efficiency for multiple ad tasks. In this paper, we propose a novel unified text ad generation framework with multi-task prompt learning, called PLATO-Ad, totackle these problems. Specifically, we design a three-phase transfer learning mechanism to tackle the low-resource ad generation problem. Furthermore, we present a novel multi-task prompt learning mechanism to efficiently utilize a single lightweight model to solve multiple ad generation tasks without loss of performance compared to training a separate model for each task. Finally, we conduct offline and online evaluations and experiment results show that PLATO-Ad significantly outperforms the state-of-the-art on both offline and online metrics. PLATO-Ad has been deployed in a leading advertising platform with 3.5% CTR improvement on search ad descriptions and 10.4% CTR improvement on feed ad titles.

pdf bib
Long Time No See! Open-Domain Conversation with Long-Term Persona Memory
Xinchao Xu | Zhibin Gou | Wenquan Wu | Zheng-Yu Niu | Hua Wu | Haifeng Wang | Shihang Wang
Findings of the Association for Computational Linguistics: ACL 2022

Most of the open-domain dialogue models tend to perform poorly in the setting of long-term human-bot conversations. The possible reason is that they lack the capability of understanding and memorizing long-term dialogue history information. To address this issue, we present a novel task of Long-term Memory Conversation (LeMon) and then build a new dialogue dataset DuLeMon and a dialogue generation framework with Long-Term Memory (LTM) mechanism (called PLATO-LTM). This LTM mechanism enables our system to accurately extract and continuously update long-term persona memory without requiring multiple-session dialogue datasets for model training. To our knowledge, this is the first attempt to conduct real-time dynamic management of persona information of both parties, including the user and the bot. Results on DuLeMon indicate that PLATO-LTM can significantly outperform baselines in terms of long-term dialogue consistency, leading to better dialogue engagingness.

pdf bib
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation
Siqi Bao | Huang He | Fan Wang | Hua Wu | Haifeng Wang | Wenquan Wu | Zhihua Wu | Zhen Guo | Hua Lu | Xinxian Huang | Xin Tian | Xinchao Xu | Yingzhan Lin | Zheng-Yu Niu
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022

To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL with up to 11 billion parameters, trained on both Chinese and English social media conversations. To train such large models, we adopt the architecture of unified transformer with high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With such designs, PLATO-XL successfully achieves superior performances as compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model of conversational AI.

2021

pdf bib
Discovering Dialog Structure Graph for Coherent Dialog Generation
Jun Xu | Zeyang Lei | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Learning discrete dialog structure graph from human-human dialogs yields basic insights into the structure of conversation, and also provides background knowledge to facilitate dialog generation. However, this problem is less studied in open-domain dialogue. In this paper, we conduct unsupervised discovery of discrete dialog structure from chitchat corpora, and then leverage it to facilitate coherent dialog generation in downstream systems. To this end, we present an unsupervised model, Discrete Variational Auto-Encoder with Graph Neural Network (DVAE-GNN), to discover discrete hierarchical latent dialog states (at the level of both session and utterance) and their transitions from corpus as a dialog structure graph. Then we leverage it as background knowledge to facilitate dialog management in a RL based dialog system. Experimental results on two benchmark corpora confirm that DVAE-GNN can discover meaningful dialog structure graph, and the use of dialog structure as background knowledge can significantly improve multi-turn coherence.

pdf bib
DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation
Zeming Liu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

In this paper, we provide a bilingual parallel human-to-human recommendation dialog dataset (DuRecDial 2.0) to enable researchers to explore a challenging task of multilingual and cross-lingual conversational recommendation. The difference between DuRecDial 2.0 and existing conversational recommendation datasets is that the data item (Profile, Goal, Knowledge, Context, Response) in DuRecDial 2.0 is annotated in two languages, both English and Chinese, while other datasets are built with the setting of a single language. We collect 8.2k dialogs aligned across English and Chinese languages (16.5k dialogs and 255k utterances in total) that are annotated by crowdsourced workers with strict quality control procedure. We then build monolingual, multilingual, and cross-lingual conversational recommendation baselines on DuRecDial 2.0. Experiment results show that the use of additional English data can bring performance improvement for Chinese conversational recommendation, indicating the benefits of DuRecDial 2.0. Finally, this dataset provides a challenging testbed for future studies of monolingual, multilingual, and cross-lingual conversational recommendation.

2020

pdf bib
Towards Conversational Recommendation over Multi-Type Dialogs
Zeming Liu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We focus on the study of conversational recommendation in the context of multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user’s interests and feedback. To facilitate the study of this task, we create a human-to-human Chinese dialog dataset DuRecDial (about 10k dialogs, 156k utterances), where there are multiple sequential dialogs for a pair of a recommendation seeker (user) and a recommender (bot). In each dialog, the recommender proactively leads a multi-type dialog to approach recommendation targets and then makes multiple recommendations with rich interaction behavior. This dataset allows us to systematically investigate different parts of the overall problem, e.g., how to naturally lead a dialog, how to interact with users for recommendation. Finally we establish baseline results on DuRecDial for future studies.

pdf bib
Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
Jun Xu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which there are vertices to represent “what to say” and “how to say”, and edges to represent natural transition between a message (the last utterance in a dialog context) and its response. We then present a novel CG grounded policy learning framework that conducts dialog flow planning by graph traversal, which learns to identify a what-vertex and a how-vertex from the CG at each turn to guide response generation. In this way, we effectively leverage the CG to facilitate policy learning as follows: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.

2019

pdf bib
Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs
Zhibin Liu | Zheng-Yu Niu | Hua Wu | Haifeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Two types of knowledge, triples from knowledge graphs and texts from documents, have been studied for knowledge aware open domain conversation generation, in which graph paths can narrow down vertex candidates for knowledge selection decision, and texts can provide rich information for response generation. Fusion of a knowledge graph and texts might yield mutually reinforcing advantages, but there is less study on that. To address this challenge, we propose a knowledge aware chatting machine with three components, an augmented knowledge graph with both triples and texts, knowledge selector, and knowledge aware response generator. For knowledge selection on the graph, we formulate it as a problem of multi-hop graph reasoning to effectively capture conversation flow, which is more explainable and flexible in comparison with previous works. To fully leverage long text information that differentiates our graph from others, we improve a state of the art reasoning algorithm with machine reading comprehension technology. We demonstrate the effectiveness of our system on two datasets in comparison with state-of-the-art models.

2017

pdf bib
Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification
Man Lan | Jianxiang Wang | Yuanbin Wu | Zheng-Yu Niu | Haifeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a novel multi-task attention based neural network model to address implicit discourse relationship representation and identification through two types of representation learning, an attention based neural network for learning discourse relationship representation with two arguments and a multi-task framework for learning knowledge from annotated and unannotated corpora. The extensive experiments have been performed on two benchmark corpora (i.e., PDTB and CoNLL-2016 datasets). Experimental results show that our proposed model outperforms the state-of-the-art systems on benchmark corpora.

2013

pdf bib
Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
Man Lan | Yu Xu | Zhengyu Niu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
ECNUCS: Recognizing Cross-lingual Textual Entailment Using Multiple Text Similarity and Text Difference Measures
Jiang Zhao | Man Lan | Zheng-Yu Niu
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2010

pdf bib
The Effects of Discourse Connectives Prediction on Implicit Discourse Relation Recognition
Zhi Min Zhou | Man Lan | Zheng Yu Niu | Yu Xu | Jian Su
Proceedings of the SIGDIAL 2010 Conference

pdf bib
Predicting Discourse Connectives for Implicit Discourse Relation Recognition
Zhi-Min Zhou | Yu Xu | Zheng-Yu Niu | Man Lan | Jian Su | Chew Lim Tan
Coling 2010: Posters

2009

pdf bib
Exploiting Heterogeneous Treebanks for Parsing
Zheng-Yu Niu | Haifeng Wang | Hua Wu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
The TCH machine translation system for IWSLT 2008.
Haifeng Wang | Hua Wu | Xiaoguang Hu | Zhanyi Liu | Jianfeng Li | Dengjun Ren | Zhengyu Niu
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on the first participation of TCH (Toshiba (China) Research and Development Center) at the IWSLT evaluation campaign. We participated in all the 5 translation tasks with Chinese as source language or target language. For Chinese-English and English-Chinese translation, we used hybrid systems that combine rule-based machine translation (RBMT) method and statistical machine translation (SMT) method. For Chinese-Spanish translation, phrase-based SMT models were used. For the pivot task, we combined the translations generated by a pivot based statistical translation model and a statistical transfer translation model (firstly, translating from Chinese to English, and then from English to Spanish). Moreover, for better performance of MT, we improved each module in the MT systems as follows: adapting Chinese word segmentation to spoken language translation, selecting out-of-domain corpus to build language models, using bilingual dictionaries to correct word alignment results, handling NE translation and selecting translations from the outputs of multiple systems. According to the automatic evaluation results on the full test sets, we top in all the 5 tasks.

2007

pdf bib
I2R: Three Systems for Word Sense Discrimination, Chinese Word Sense Disambiguation, and English Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

pdf bib
Relation Extraction Using Label Propagation Based Semi-Supervised Learning
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Unsupervised Relation Disambiguation Using Spectral Clustering
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Semi-supervised Relation Extraction with Label Propagation
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

pdf bib
Partially Supervised Sense Disambiguation by Learning Sense Number from Tagged and Untagged Corpora
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Unsupervised Relation Disambiguation with Order Identification Capabilities
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2005

pdf bib
Automatic Relation Extraction with Model Order Selection and Discriminative Label Identification
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Unsupervised Feature Selection for Relation Extraction
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

pdf bib
A Semi-Supervised Feature Clustering Algorithm with Application to Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
Word Sense Disambiguation Using Label Propagation Based Semi-Supervised Learning
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf bib
Learning Word Sense With Feature Selection and Order Identification Capabilities
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Optimizing feature set for Chinese Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text