Zheng-Yu Niu

Also published as: Zheng Yu Niu, Zhengyu Niu


2022

pdf bib
Where to Go for the Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals
Zeming Liu | Jun Xu | Zeyang Lei | Haifeng Wang | Zheng-Yu Niu | Hua Wu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most dialog systems posit that users have figured out clear and specific goals before starting an interaction. For example, users have determined the departure, the destination, and the travel time for booking a flight. However, in many scenarios, limited by experience and knowledge, users may know what they need, but still struggle to figure out clear and specific goals by determining all the necessary slots. In this paper, we identify this challenge, and make a step forward by collecting a new human-to-human mixed-type dialog corpus. It contains 5k dialog sessions and 168k utterances for 4 dialog types and 5 domains. Within each session, an agent first provides user-goal-related knowledge to help figure out clear and specific goals, and then help achieve them. Furthermore, we propose a mixed-type dialog model with a novel Prompt-based continual learning mechanism. Specifically, the mechanism enables the model to continually strengthen its ability on any specific type by utilizing existing dialog corpora effectively.

pdf bib
Long Time No See! Open-Domain Conversation with Long-Term Persona Memory
Xinchao Xu | Zhibin Gou | Wenquan Wu | Zheng-Yu Niu | Hua Wu | Haifeng Wang | Shihang Wang
Findings of the Association for Computational Linguistics: ACL 2022

Most of the open-domain dialogue models tend to perform poorly in the setting of long-term human-bot conversations. The possible reason is that they lack the capability of understanding and memorizing long-term dialogue history information. To address this issue, we present a novel task of Long-term Memory Conversation (LeMon) and then build a new dialogue dataset DuLeMon and a dialogue generation framework with Long-Term Memory (LTM) mechanism (called PLATO-LTM). This LTM mechanism enables our system to accurately extract and continuously update long-term persona memory without requiring multiple-session dialogue datasets for model training. To our knowledge, this is the first attempt to conduct real-time dynamic management of persona information of both parties, including the user and the bot. Results on DuLeMon indicate that PLATO-LTM can significantly outperform baselines in terms of long-term dialogue consistency, leading to better dialogue engagingness.

pdf bib
PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation
Siqi Bao | Huang He | Fan Wang | Hua Wu | Haifeng Wang | Wenquan Wu | Zhihua Wu | Zhen Guo | Hua Lu | Xinxian Huang | Xin Tian | Xinchao Xu | Yingzhan Lin | Zheng-Yu Niu
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022

To explore the limit of dialogue generation pre-training, we present the models of PLATO-XL with up to 11 billion parameters, trained on both Chinese and English social media conversations. To train such large models, we adopt the architecture of unified transformer with high computation and parameter efficiency. In addition, we carry out multi-party aware pre-training to better distinguish the characteristic information in social media conversations. With such designs, PLATO-XL successfully achieves superior performances as compared to other approaches in both Chinese and English chitchat. We further explore the capacity of PLATO-XL on other conversational tasks, such as knowledge grounded dialogue and task-oriented conversation. The experimental results indicate that PLATO-XL obtains state-of-the-art results across multiple conversational tasks, verifying its potential as a foundation model of conversational AI.

2021

pdf bib
Discovering Dialog Structure Graph for Coherent Dialog Generation
Jun Xu | Zeyang Lei | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Learning discrete dialog structure graph from human-human dialogs yields basic insights into the structure of conversation, and also provides background knowledge to facilitate dialog generation. However, this problem is less studied in open-domain dialogue. In this paper, we conduct unsupervised discovery of discrete dialog structure from chitchat corpora, and then leverage it to facilitate coherent dialog generation in downstream systems. To this end, we present an unsupervised model, Discrete Variational Auto-Encoder with Graph Neural Network (DVAE-GNN), to discover discrete hierarchical latent dialog states (at the level of both session and utterance) and their transitions from corpus as a dialog structure graph. Then we leverage it as background knowledge to facilitate dialog management in a RL based dialog system. Experimental results on two benchmark corpora confirm that DVAE-GNN can discover meaningful dialog structure graph, and the use of dialog structure as background knowledge can significantly improve multi-turn coherence.

pdf bib
DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation
Zeming Liu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

In this paper, we provide a bilingual parallel human-to-human recommendation dialog dataset (DuRecDial 2.0) to enable researchers to explore a challenging task of multilingual and cross-lingual conversational recommendation. The difference between DuRecDial 2.0 and existing conversational recommendation datasets is that the data item (Profile, Goal, Knowledge, Context, Response) in DuRecDial 2.0 is annotated in two languages, both English and Chinese, while other datasets are built with the setting of a single language. We collect 8.2k dialogs aligned across English and Chinese languages (16.5k dialogs and 255k utterances in total) that are annotated by crowdsourced workers with strict quality control procedure. We then build monolingual, multilingual, and cross-lingual conversational recommendation baselines on DuRecDial 2.0. Experiment results show that the use of additional English data can bring performance improvement for Chinese conversational recommendation, indicating the benefits of DuRecDial 2.0. Finally, this dataset provides a challenging testbed for future studies of monolingual, multilingual, and cross-lingual conversational recommendation.

2020

pdf bib
Towards Conversational Recommendation over Multi-Type Dialogs
Zeming Liu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We focus on the study of conversational recommendation in the context of multi-type dialogs, where the bots can proactively and naturally lead a conversation from a non-recommendation dialog (e.g., QA) to a recommendation dialog, taking into account user’s interests and feedback. To facilitate the study of this task, we create a human-to-human Chinese dialog dataset DuRecDial (about 10k dialogs, 156k utterances), where there are multiple sequential dialogs for a pair of a recommendation seeker (user) and a recommender (bot). In each dialog, the recommender proactively leads a multi-type dialog to approach recommendation targets and then makes multiple recommendations with rich interaction behavior. This dataset allows us to systematically investigate different parts of the overall problem, e.g., how to naturally lead a dialog, how to interact with users for recommendation. Finally we establish baseline results on DuRecDial for future studies.

pdf bib
Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation
Jun Xu | Haifeng Wang | Zheng-Yu Niu | Hua Wu | Wanxiang Che | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and learn a graph grounded dialog policy, aimed at fostering a more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which there are vertices to represent “what to say” and “how to say”, and edges to represent natural transition between a message (the last utterance in a dialog context) and its response. We then present a novel CG grounded policy learning framework that conducts dialog flow planning by graph traversal, which learns to identify a what-vertex and a how-vertex from the CG at each turn to guide response generation. In this way, we effectively leverage the CG to facilitate policy learning as follows: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.

2019

pdf bib
Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs
Zhibin Liu | Zheng-Yu Niu | Hua Wu | Haifeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Two types of knowledge, triples from knowledge graphs and texts from documents, have been studied for knowledge aware open domain conversation generation, in which graph paths can narrow down vertex candidates for knowledge selection decision, and texts can provide rich information for response generation. Fusion of a knowledge graph and texts might yield mutually reinforcing advantages, but there is less study on that. To address this challenge, we propose a knowledge aware chatting machine with three components, an augmented knowledge graph with both triples and texts, knowledge selector, and knowledge aware response generator. For knowledge selection on the graph, we formulate it as a problem of multi-hop graph reasoning to effectively capture conversation flow, which is more explainable and flexible in comparison with previous works. To fully leverage long text information that differentiates our graph from others, we improve a state of the art reasoning algorithm with machine reading comprehension technology. We demonstrate the effectiveness of our system on two datasets in comparison with state-of-the-art models.

2017

pdf bib
Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification
Man Lan | Jianxiang Wang | Yuanbin Wu | Zheng-Yu Niu | Haifeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a novel multi-task attention based neural network model to address implicit discourse relationship representation and identification through two types of representation learning, an attention based neural network for learning discourse relationship representation with two arguments and a multi-task framework for learning knowledge from annotated and unannotated corpora. The extensive experiments have been performed on two benchmark corpora (i.e., PDTB and CoNLL-2016 datasets). Experimental results show that our proposed model outperforms the state-of-the-art systems on benchmark corpora.

2013

pdf bib
Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition
Man Lan | Yu Xu | Zhengyu Niu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
ECNUCS: Recognizing Cross-lingual Textual Entailment Using Multiple Text Similarity and Text Difference Measures
Jiang Zhao | Man Lan | Zheng-Yu Niu
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2010

pdf bib
The Effects of Discourse Connectives Prediction on Implicit Discourse Relation Recognition
Zhi Min Zhou | Man Lan | Zheng Yu Niu | Yu Xu | Jian Su
Proceedings of the SIGDIAL 2010 Conference

pdf bib
Predicting Discourse Connectives for Implicit Discourse Relation Recognition
Zhi-Min Zhou | Yu Xu | Zheng-Yu Niu | Man Lan | Jian Su | Chew Lim Tan
Coling 2010: Posters

2009

pdf bib
Exploiting Heterogeneous Treebanks for Parsing
Zheng-Yu Niu | Haifeng Wang | Hua Wu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
The TCH machine translation system for IWSLT 2008.
Haifeng Wang | Hua Wu | Xiaoguang Hu | Zhanyi Liu | Jianfeng Li | Dengjun Ren | Zhengyu Niu
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper reports on the first participation of TCH (Toshiba (China) Research and Development Center) at the IWSLT evaluation campaign. We participated in all the 5 translation tasks with Chinese as source language or target language. For Chinese-English and English-Chinese translation, we used hybrid systems that combine rule-based machine translation (RBMT) method and statistical machine translation (SMT) method. For Chinese-Spanish translation, phrase-based SMT models were used. For the pivot task, we combined the translations generated by a pivot based statistical translation model and a statistical transfer translation model (firstly, translating from Chinese to English, and then from English to Spanish). Moreover, for better performance of MT, we improved each module in the MT systems as follows: adapting Chinese word segmentation to spoken language translation, selecting out-of-domain corpus to build language models, using bilingual dictionaries to correct word alignment results, handling NE translation and selecting translations from the outputs of multiple systems. According to the automatic evaluation results on the full test sets, we top in all the 5 tasks.

2007

pdf bib
I2R: Three Systems for Word Sense Discrimination, Chinese Word Sense Disambiguation, and English Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

pdf bib
Relation Extraction Using Label Propagation Based Semi-Supervised Learning
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Unsupervised Relation Disambiguation Using Spectral Clustering
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Semi-supervised Relation Extraction with Label Propagation
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

pdf bib
Partially Supervised Sense Disambiguation by Learning Sense Number from Tagged and Untagged Corpora
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Unsupervised Relation Disambiguation with Order Identification Capabilities
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2005

pdf bib
Automatic Relation Extraction with Model Order Selection and Discriminative Label Identification
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Unsupervised Feature Selection for Relation Extraction
Jinxiu Chen | Donghong Ji | Chew Lim Tan | Zhengyu Niu
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

pdf bib
Word Sense Disambiguation Using Label Propagation Based Semi-Supervised Learning
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
A Semi-Supervised Feature Clustering Algorithm with Application to Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew Lim Tan
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf bib
Learning Word Sense With Feature Selection and Order Identification Capabilities
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Optimizing feature set for Chinese Word Sense Disambiguation
Zheng-Yu Niu | Dong-Hong Ji | Chew-Lim Tan
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text