Xiaoyan Zhu


2021

A Semantic-based Method for Unsupervised Commonsense Question Answering
Yilin Niu | Fei Huang | Jiaming Liang | Wenkai Chen | Xiaoyan Zhu | Minlie Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Unsupervised commonsense question answering is appealing since it does not rely on any labeled task data. Among existing work, a popular solution is to use pre-trained language models to score candidate choices directly conditioned on the question or context. However, such scores from language models can be easily affected by irrelevant factors, such as word frequencies and sentence structures. These distracting factors may not only mislead the model into choosing a wrong answer but also make it oversensitive to lexical perturbations in candidate answers. In this paper, we present a novel SEmantic-based Question Answering method (SEQA) for unsupervised commonsense question answering. Instead of directly scoring each answer choice, our method first generates a set of plausible answers with generative models (e.g., GPT-2), and then uses these plausible answers to select the correct choice by considering the semantic similarity between each plausible answer and each choice. We devise a simple, yet sound formalism for this idea and verify its effectiveness and robustness with extensive experiments. We evaluate the proposed method on four benchmark datasets, and our method achieves the best results in unsupervised settings. Moreover, when attacked by TextFooler with synonym replacement, SEQA shows much smaller performance drops than baselines, indicating stronger robustness.
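
The generate-then-compare idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's formalism: the plausible answers are hard-coded rather than sampled from GPT-2, the bag-of-words cosine is a crude stand-in for a semantic similarity model, and the names `cosine` and `select_choice` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (assumed stand-in for a sentence encoder)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_choice(plausible_answers, choices):
    """Score each choice by its mean similarity to the plausible answers
    and return (index of the best choice, list of scores)."""
    scores = [
        sum(cosine(ans, choice) for ans in plausible_answers) / len(plausible_answers)
        for choice in choices
    ]
    best = max(range(len(choices)), key=scores.__getitem__)
    return best, scores


# Hypothetical answers that a generator might produce for
# "Why did she take an umbrella?" (hard-coded for illustration).
plausible = ["it was raining outside", "rain was expected", "the sky looked like rain"]
choices = ["because it was raining", "because she was hungry"]
best, scores = select_choice(plausible, choices)
print(choices[best])
```

Averaging over several plausible answers means that no single surface form decides the outcome, which is one way to read the robustness argument in the abstract.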

NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer
Fei Huang | Zikai Chen | Chen Henry Wu | Qihan Guo | Xiaoyan Zhu | Minlie Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs
Pei Ke | Haozhe Ji | Yu Ran | Xin Cui | Liwei Wang | Linfeng Song | Xiaoyan Zhu | Minlie Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation
Jian Guan | Fei Huang | Zhihao Zhao | Xiaoyan Zhu | Minlie Huang
Transactions of the Association for Computational Linguistics, Volume 8

Story generation, namely, generating a reasonable story from a leading context, is an important but challenging task. In spite of the success in modeling fluency and local coherence, existing neural language generation models (e.g., GPT-2) still suffer from repetition, logic conflicts, and lack of long-range coherence in generated stories. We conjecture that this is because of the difficulty of associating relevant commonsense knowledge, understanding the causal relationships, and planning entities and events with proper temporal order. In this paper, we devise a knowledge-enhanced pretraining model for commonsense story generation. We propose to utilize commonsense knowledge from external knowledge bases to generate reasonable stories. To further capture the causal and temporal dependencies between the sentences in a reasonable story, we use multi-task learning, which combines a discriminative objective to distinguish true and fake stories during fine-tuning. Automatic and manual evaluation shows that our model can generate more reasonable stories than state-of-the-art baselines, particularly in terms of logic and global coherence.

CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Qi Zhu | Kaili Huang | Zheng Zhang | Xiaoyan Zhu | Minlie Huang
Transactions of the Association for Computational Linguistics, Volume 8

To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances for 5 domains: hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts on both user and system sides. About 60% of the dialogues have cross-domain user goals that favor inter-domain dependency and encourage natural transition across domains in conversation. We also provide a user simulator and several benchmark models for pipelined task-oriented dialogue systems, which will help researchers compare and evaluate their models on this corpus. The large size and rich annotation of CrossWOZ make it suitable for investigating a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, and user simulation.

KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
Hao Zhou | Chujie Zheng | Kaili Huang | Minlie Huang | Xiaoyan Zhu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Research on knowledge-driven conversational systems is largely limited by the lack of dialog data that consists of multi-turn conversations on multiple topics and with knowledge annotations. In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. Our corpus contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transition between multiple topics. To facilitate further research on this corpus, we provide several benchmark models. Comparative results show that the models can be enhanced by introducing background knowledge, yet there is still much room for leveraging knowledge to model multi-turn conversations in further research. Results also show that there are obvious performance differences between different domains, indicating that it is worth further exploring transfer learning and domain adaptation. The corpus and benchmark models are publicly available.

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
Qi Zhu | Zheng Zhang | Yan Fang | Xiang Li | Ryuichi Takanobu | Jinchao Li | Baolin Peng | Jianfeng Gao | Xiaoyan Zhu | Minlie Huang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present ConvLab-2, an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weaknesses of systems. As the successor of ConvLab, ConvLab-2 inherits ConvLab’s framework but integrates more powerful dialogue models and supports more datasets. Besides, we have developed an analysis tool and an interactive tool to assist researchers in diagnosing dialogue systems. The analysis tool presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The interactive tool provides a user interface that allows developers to diagnose an assembled dialogue system by interacting with the system and modifying the output of each system component.

Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph
Haozhe Ji | Pei Ke | Shaohan Huang | Furu Wei | Xiaoyan Zhu | Minlie Huang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Despite the success of generative pre-trained language models on a series of text generation tasks, they still suffer in cases where reasoning over underlying commonsense knowledge is required during generation. Existing approaches that integrate commonsense knowledge into generative pre-trained language models simply transfer relational knowledge by post-training on individual knowledge triples while ignoring rich connections within the knowledge graph. We argue that exploiting both the structural and semantic information of the knowledge graph facilitates commonsense-aware text generation. In this paper, we propose Generation with Multi-Hop Reasoning Flow (GRF) that enables pre-trained models with dynamic multi-hop reasoning on multi-relational paths extracted from the external commonsense knowledge graph. We empirically show that our model outperforms existing baselines on three text generation tasks that require reasoning over commonsense knowledge. We also demonstrate the effectiveness of the dynamic multi-hop reasoning module with reasoning paths inferred by the model that provide rationale to the generation.

SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge
Pei Ke | Haozhe Ji | Siyang Liu | Xiaoyan Zhu | Minlie Huang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Most of the existing pre-trained language representation models neglect to consider the linguistic knowledge of texts, which can promote language understanding in NLP tasks. To benefit the downstream tasks in sentiment analysis, we propose a novel language representation model called SentiLARE, which introduces word-level linguistic knowledge including part-of-speech tag and sentiment polarity (inferred from SentiWordNet) into pre-trained models. We first propose a context-aware sentiment attention mechanism to acquire the sentiment polarity of each word with its part-of-speech tag by querying SentiWordNet. Then, we devise a new pre-training task called label-aware masked language model to construct knowledge-aware language representation. Experiments show that SentiLARE obtains new state-of-the-art performance on a variety of sentiment analysis tasks.

Learning Goal-oriented Dialogue Policy with opposite Agent Awareness
Zheng Zhang | Lizi Liao | Xiaoyan Zhu | Tat-Seng Chua | Zitao Liu | Yan Huang | Minlie Huang
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Most existing approaches for goal-oriented dialogue policy learning use reinforcement learning, which focuses on the target agent policy and simply treats the opposite agent policy as part of the environment. In real-world scenarios, however, the behavior of an opposite agent often exhibits certain patterns or follows hidden policies, which can be inferred and utilized by the target agent to facilitate its own decision making. This strategy is common in human mental simulation, where one first imagines a specific action and its probable results before actually carrying it out. We therefore propose an opposite behavior aware framework for policy learning in goal-oriented dialogues. We estimate the opposite agent’s policy from its behavior and use this estimation to improve the target agent by regarding it as part of the target policy. We evaluate our model on both cooperative and competitive dialogue tasks, showing superior performance over state-of-the-art baselines.

2019

Long and Diverse Text Generation with Planning-based Hierarchical Variational Model
Zhihong Shao | Minlie Huang | Jiangtao Wen | Wenfei Xu | Xiaoyan Zhu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure where a global planning latent variable models the diversity of reasonable planning and a sequence of local latent variables controls sentence realization. Experiments show that our model outperforms state-of-the-art baselines in long and diverse text generation.

ARAML: A Stable Adversarial Training Framework for Text Generation
Pei Ke | Fei Huang | Minlie Huang | Xiaoyan Zhu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Most of the existing generative adversarial networks (GAN) for text generation suffer from the instability of reinforcement learning training algorithms such as policy gradient, leading to unstable performance. To tackle this problem, we propose a novel framework called Adversarial Reward Augmented Maximum Likelihood (ARAML). During adversarial training, the discriminator assigns rewards to samples which are acquired from a stationary distribution near the data rather than the generator’s distribution. The generator is optimized with maximum likelihood estimation augmented by the discriminator’s rewards instead of policy gradient. Experiments show that our model can outperform state-of-the-art text GANs with a more stable training process.

2018

An Operation Network for Abstractive Sentence Compression
Naitong Yu | Jie Zhang | Minlie Huang | Xiaoyan Zhu
Proceedings of the 27th International Conference on Computational Linguistics

Sentence compression condenses a sentence while preserving its most important contents. Delete-based models have a strong ability to delete undesired words, while generate-based models are able to reorder or rephrase the words, which is more consistent with how humans compress sentences. In this paper, we propose Operation Network, a neural network approach for abstractive sentence compression, which combines the advantages of both delete-based and generate-based sentence compression models. The central idea of Operation Network is to model the sentence compression process as an editing procedure. First, unnecessary words are deleted from the source sentence; then new words are either generated from a large vocabulary or copied directly from the source sentence. A compressed sentence can be obtained by a series of such edit operations (delete, copy, and generate). Experiments show that Operation Network outperforms state-of-the-art baselines.
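
To make the editing view concrete, the sketch below applies a hand-written sequence of delete, copy, and generate operations to a source sentence. It only illustrates how such operations compose; the operation encoding and the function name `apply_operations` are assumptions made here, and in the paper the operations are predicted by a neural network rather than listed by hand.

```python
def apply_operations(source, operations):
    """Apply edit operations to a list of source tokens.

    ("delete",)      skips the next source token
    ("copy",)        emits the next source token unchanged
    ("generate", w)  emits a new word w without consuming a source token
    """
    output, i = [], 0
    for op in operations:
        kind = op[0]
        if kind == "delete":
            i += 1
        elif kind == "copy":
            output.append(source[i])
            i += 1
        elif kind == "generate":
            output.append(op[1])
        else:
            raise ValueError(f"unknown operation: {kind}")
    return output


# Hypothetical example sentence and operation sequence.
source = "the committee finally reached a decision yesterday".split()
operations = [
    ("copy",),                 # the
    ("copy",),                 # committee
    ("delete",),               # finally
    ("generate", "decided"),   # new word that replaces the remaining phrase
    ("delete",),               # reached
    ("delete",),               # a
    ("delete",),               # decision
    ("delete",),               # yesterday
]
compressed = apply_operations(source, operations)
print(" ".join(compressed))
```

Because copy and delete consume source tokens while generate does not, the same loop covers pure deletion, pure generation, and any mixture of the two.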

An Interpretable Reasoning Network for Multi-Relation Question Answering
Mantong Zhou | Minlie Huang | Xiaoyan Zhu
Proceedings of the 27th International Conference on Computational Linguistics

Multi-relation Question Answering is a challenging task, because it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.

Generating Informative Responses with Controlled Sentence Function
Pei Ke | Jian Guan | Minlie Huang | Xiaoyan Zhu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Sentence function is a significant factor to achieve the purpose of the speaker, which, however, has not been touched in large-scale conversation generation so far. In this paper, we present a model to generate informative responses with controlled sentence function. Our model utilizes a continuous latent variable to capture various word patterns that realize the expected sentence function, and introduces a type controller to deal with the compatibility of controlling sentence function and generating informative content. Conditioned on the latent variable, the type controller determines the type (i.e., function-related, topic, and ordinary word) of a word to be generated at each decoding position. Experiments show that our model outperforms state-of-the-art baselines, and it has the ability to generate responses with both controlled sentence function and informative content.

2017

Linguistically Regularized LSTM for Sentiment Classification
Qiao Qian | Minlie Huang | Jinhao Lei | Xiaoyan Zhu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper deals with sentence-level sentiment classification. Although a variety of neural network models have been proposed recently, previous models either depend on expensive phrase-level annotation, and most of them perform remarkably worse when trained with only sentence-level annotation, or do not fully employ linguistic resources (e.g., sentiment lexicons, negation words, intensity words). In this paper, we propose simple models that are trained with sentence-level annotation but also attempt to model the linguistic role of sentiment lexicons, negation words, and intensity words. Results show that our models are able to capture the linguistic role of sentiment words, negation words, and intensity words in sentiment expression.

2016

Attention-based LSTM for Aspect-level Sentiment Classification
Yequan Wang | Minlie Huang | Xiaoyan Zhu | Li Zhao
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

TransG : A Generative Model for Knowledge Graph Embedding
Han Xiao | Minlie Huang | Xiaoyan Zhu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

GAKE: Graph Aware Knowledge Embedding
Jun Feng | Minlie Huang | Yang Yang | Xiaoyan Zhu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Knowledge embedding, which projects triples in a given knowledge base to d-dimensional vectors, has attracted considerable research efforts recently. Most existing approaches treat the given knowledge base as a set of triples, each of whose representation is then learned separately. However, in fact, triples are connected and depend on each other. In this paper, we propose a graph aware knowledge embedding method (GAKE), which formulates a knowledge base as a directed graph, and learns representations for any vertices or edges by leveraging the graph’s structural information. We introduce three types of graph context for embedding: neighbor context, path context, and edge context, each reflecting properties of knowledge from a different perspective. We also design an attention mechanism to learn the representative power of different vertices or edges. To validate our method, we conduct several experiments on two tasks. Experimental results suggest that our method outperforms several state-of-the-art knowledge embedding models.

Product Review Summarization by Exploiting Phrase Properties
Naitong Yu | Minlie Huang | Yuanyuan Shi | Xiaoyan Zhu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a phrase-based approach for generating product review summaries. The main idea of our method is to leverage phrase properties to choose a subset of optimal phrases for generating the final summary. Specifically, we exploit two phrase properties, popularity and specificity. Popularity describes how popular the phrase is in the original reviews. Specificity describes how descriptive a phrase is in comparison to generic comments. We formalize the phrase selection procedure as an optimization problem and solve it using integer linear programming (ILP). An aspect-based bigram language model is used for generating the final summary with the selected phrases. Experiments show that our summarizer outperforms the other baselines.

Context-aware Natural Language Generation for Spoken Dialogue Systems
Hao Zhou | Minlie Huang | Xiaoyan Zhu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Natural language generation (NLG) is an important component of question answering (QA) systems and has a significant impact on system quality. Most traditional QA systems based on templates or rules tend to generate rigid and stylised responses without the natural variation of human language. Furthermore, such methods need a considerable amount of work to create the templates or rules. To address this problem, we propose a Context-Aware LSTM model (CA-LSTM) for NLG. The model is completely driven by data without manually designed templates or rules. In addition, the context information, including the question to be answered, semantic values to be addressed in the response, and the dialogue act type during interaction, is well approached in the neural network model, which enables the model to produce varied and informative responses. The quantitative evaluation and human evaluation show that CA-LSTM obtains state-of-the-art performance.

2015

Learning Tag Embeddings and Tag-specific Composition Functions in Recursive Neural Network
Qiao Qian | Bo Tian | Minlie Huang | Yang Liu | Xuan Zhu | Xiaoyan Zhu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Clustering Aspect-related Phrases by Leveraging Sentiment Distribution Consistency
Li Zhao | Minlie Huang | Haiqiang Chen | Junjun Cheng | Xiaoyan Zhu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

New Word Detection for Sentiment Analysis
Minlie Huang | Borui Ye | Yichen Wang | Haiqiang Chen | Junjun Cheng | Xiaoyan Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

QA: from Turing Test to Intelligent Information Service
Xiaoyan Zhu
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
Fangtao Li | Sinno Jialin Pan | Ou Jin | Qiang Yang | Xiaoyan Zhu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

String Re-writing Kernel
Fan Bu | Hang Li | Xiaoyan Zhu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2011

Quality-biased Ranking of Short Texts in Microblogging Services
Minlie Huang | Yi Yang | Xiaoyan Zhu
Proceedings of 5th International Joint Conference on Natural Language Processing

K2Q: Generating Natural Language Questions from Keywords with User Refinements
Zhicheng Zheng | Xiance Si | Edward Chang | Xiaoyan Zhu
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Learning to Link Entities with Knowledge Base
Zhicheng Zheng | Fangtao Li | Minlie Huang | Xiaoyan Zhu
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Recognizing Biomedical Named Entities Using Skip-Chain Conditional Random Fields
Jingchen Liu | Minlie Huang | Xiaoyan Zhu
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Function-Based Question Classification for General QA
Fan Bu | Xingwei Zhu | Yu Hao | Xiaoyan Zhu
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Measuring the Non-compositionality of Multiword Expressions
Fan Bu | Xiaoyan Zhu | Ming Li
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Structure-Aware Review Mining and Summarization
Fangtao Li | Chao Han | Minlie Huang | Xiaoyan Zhu | Ying-Ju Xia | Shu Zhang | Hao Yu
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

A Comparative Study on Ranking and Selection Strategies for Multi-Document Summarization
Feng Jin | Minlie Huang | Xiaoyan Zhu
Coling 2010: Posters

A Review Selection Approach for Accurate Feature Rating Estimation
Chong Long | Jie Zhang | Xiaoyan Zhu
Coling 2010: Posters

2009

Answering Opinion Questions with Random Walks on Graphs
Fangtao Li | Yang Tang | Minlie Huang | Xiaoyan Zhu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Towards Automatic Generation of Gene Summary
Feng Jin | Minlie Huang | Zhiyong Lu | Xiaoyan Zhu
Proceedings of the BioNLP 2009 Workshop

2008

Answer Validation by Information Distance Calculation
Fangtao Li | Xian Zhang | Xiaoyan Zhu
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

Classifying What-Type Questions by Head Noun Tagging
Fangtao Li | Xian Zhang | Jinhui Yuan | Xiaoyan Zhu
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Using Conditional Random Fields to Extract Contexts and Answers of Questions from Online Forums
Shilin Ding | Gao Cong | Chin-Yew Lin | Xiaoyan Zhu
Proceedings of ACL-08: HLT

2004

Discovering Patterns to Extract Protein-Protein Interactions from Full Biomedical Texts
Minlie Huang | Xiaoyan Zhu | Donald G. Payan | Kunbin Qu | Ming Li
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)