Cao Liu


pdf bib
MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis
Cong Chen | Jiansong Chen | Cao Liu | Fan Yang | Guanglu Wan | Jinxiong Xia
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Sentiment analysis is a fundamental task, and structure sentiment analysis (SSA) is an important component of sentiment analysis. However, traditional SSA is suffering from some important issues: (1) lack of interactive knowledge of different languages; (2) small amount of annotation data or even no annotation data. To address the above problems, we incorporate data augment and auxiliary tasks within a cross-lingual pretrained language model into SSA. Specifically, we employ XLM-Roberta to enhance mutually interactive information when parallel data is available in the pretraining stage. Furthermore, we leverage two data augment strategies and auxiliary tasks to improve the performance on few-label data and zero-shot cross-lingual settings. Experiments demonstrate the effectiveness of our models. Our models rank first on the cross-lingual sub-task and rank second on the monolingual sub-task of SemEval-2022 task 10.


pdf bib
Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks
Qingbin Liu | Pengfei Cao | Cao Liu | Jiansong Chen | Xunliang Cai | Fan Yang | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Dialogue state tracking (DST), which estimates user goals given a dialogue context, is an essential component of task-oriented dialogue systems. Conventional DST models are usually trained offline, which requires a fixed dataset prepared in advance. This paradigm is often impractical in real-world applications since online dialogue systems usually involve continually emerging new data and domains. Therefore, this paper explores Domain-Lifelong Learning for Dialogue State Tracking (DLL-DST), which aims to continually train a DST model on new data to learn incessantly emerging new domains while avoiding catastrophically forgetting old learned domains. To this end, we propose a novel domain-lifelong learning method, called Knowledge Preservation Networks (KPN), which consists of multi-prototype enhanced retrospection and multi-strategy knowledge distillation, to solve the problems of expression diversity and combinatorial explosion in the DLL-DST task. Experimental results show that KPN effectively alleviates catastrophic forgetting and outperforms previous state-of-the-art lifelong learning methods by 4.25% and 8.27% of whole joint goal accuracy on the MultiWOZ benchmark and the SGD benchmark, respectively.


pdf bib
Incorporating Interlocutor-Aware Context into Response Generation on Multi-Party Chatbots
Cao Liu | Kang Liu | Shizhu He | Zaiqing Nie | Jun Zhao
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Conventional chatbots focus on two-party response generation, which simplifies the real dialogue scene. In this paper, we strive toward a novel task of Response Generation on Multi-Party Chatbot (RGMPC), where the generated responses heavily rely on the interlocutors’ roles (e.g., speaker and addressee) and their utterances. Unfortunately, complex interactions among the interlocutors’ roles make it challenging to precisely capture conversational contexts and interlocutors’ information. Facing this challenge, we present a response generation model which incorporates Interlocutor-aware Contexts into Recurrent Encoder-Decoder frameworks (ICRED) for RGMPC. Specifically, we employ interactive representations to capture dialogue contexts for different interlocutors. Moreover, we leverage an addressee memory to enhance contextual interlocutor information for the target addressee. Finally, we construct a corpus for RGMPC based on an existing open-access dataset. Automatic and manual evaluations demonstrate that the ICRED remarkably outperforms strong baselines.

pdf bib
Vocabulary Pyramid Network: Multi-Pass Encoding and Decoding with Multi-Level Vocabularies for Response Generation
Cao Liu | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We study the task of response generation. Conventional methods employ a fixed vocabulary and one-pass decoding, which not only make them prone to safe and general responses but also lack further refining to the first generated raw sequence. To tackle the above two problems, we present a Vocabulary Pyramid Network (VPN) which is able to incorporate multi-pass encoding and decoding with multi-level vocabularies into response generation. Specifically, the dialogue input and output are represented by multi-level vocabularies which are obtained from hierarchical clustering of raw words. Then, multi-pass encoding and decoding are conducted on the multi-level vocabularies. Since VPN is able to leverage rich encoding and decoding information with multi-level vocabularies, it has the potential to generate better responses. Experiments on English Twitter and Chinese Weibo datasets demonstrate that VPN remarkably outperforms strong baselines.

pdf bib
Generating Questions for Knowledge Bases via Incorporating Diversified Contexts and Answer-Aware Loss
Cao Liu | Kang Liu | Shizhu He | Zaiqing Nie | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We tackle the task of question generation over knowledge bases. Conventional methods for this task neglect two crucial research issues: 1) the given predicate needs to be expressed; 2) the answer to the generated question needs to be definitive. In this paper, we strive toward the above two issues via incorporating diversified contexts and answer-aware loss. Specifically, we propose a neural encoder-decoder model with multi-level copy mechanisms to generate such questions. Furthermore, the answer aware loss is introduced to make generated questions corresponding to more definitive answers. Experiments demonstrate that our model achieves state-of-the-art performance. Meanwhile, such generated question is able to express the given predicate and correspond to a definitive answer.


pdf bib
Generating Natural Answers by Incorporating Copying and Retrieving Mechanisms in Sequence-to-Sequence Learning
Shizhu He | Cao Liu | Kang Liu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Generating answer with natural language sentence is very important in real-world question answering systems, which needs to obtain a right answer as well as a coherent natural response. In this paper, we propose an end-to-end question answering system called COREQA in sequence-to-sequence learning, which incorporates copying and retrieving mechanisms to generate natural answers within an encoder-decoder framework. Specifically, in COREQA, the semantic units (words, phrases and entities) in a natural answer are dynamically predicted from the vocabulary, copied from the given question and/or retrieved from the corresponding knowledge base jointly. Our empirical study on both synthetic and real-world datasets demonstrates the efficiency of COREQA, which is able to generate correct, coherent and natural answers for knowledge inquired questions.

pdf bib
IJCNLP-2017 Task 5: Multi-choice Question Answering in Examinations
Shangmin Guo | Kang Liu | Shizhu He | Cao Liu | Jun Zhao | Zhuoyu Wei
Proceedings of the IJCNLP 2017, Shared Tasks

The IJCNLP-2017 Multi-choice Question Answering(MCQA) task aims at exploring the performance of current Question Answering(QA) techniques via the realworld complex questions collected from Chinese Senior High School Entrance Examination papers and CK12 website1. The questions are all 4-way multi-choice questions writing in Chinese and English respectively that cover a wide range of subjects, e.g. Biology, History, Life Science and etc. And, all questions are restrained within the elementary and middle school level. During the whole procedure of this task, 7 teams submitted 323 runs in total. This paper describes the collected data, the format and size of these questions, formal run statistics and results, overview and performance statistics of different methods