Wei Chu


pdf bib
Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation
Mingzhe Li | XieXiong Lin | Xiuying Chen | Jinxiong Chang | Qishen Zhang | Feng Wang | Taifeng Wang | Zhongyi Liu | Wei Chu | Dongyan Zhao | Rui Yan
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Contrastive learning has achieved impressive success in generation tasks to militate the “exposure bias” problem and discriminatively exploit the different quality of references. Existing works mostly focus on contrastive learning on the instance-level without discriminating the contribution of each word, while keywords are the gist of the text and dominant the constrained mapping relationships. Hence, in this work, we propose a hierarchical contrastive learning mechanism, which can unify hybrid granularities semantic meaning in the input text. Concretely, we first propose a keyword graph via contrastive correlations of positive-negative pairs to iteratively polish the keyword representations. Then, we construct intra-contrasts within instance-level and keyword-level, where we assume words are sampled nodes from a sentence distribution. Finally, to bridge the gap between independent contrast levels and tackle the common contrast vanishing problem, we propose an inter-contrast mechanism that measures the discrepancy between contrastive keyword nodes respectively to the instance distribution. Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.

pdf bib
Extracting Trigger-sharing Events via an Event Matrix
Jun Xu | Weidi Xu | Mengshu Sun | Taifeng Wang | Wei Chu
Findings of the Association for Computational Linguistics: EMNLP 2022

A growing interest emerges in event extraction which aims to extract multiple events with triggers and arguments. Previous methods mitigate the problem of multiple events extraction by predicting the arguments conditioned on the event trigger and event type, assuming that these arguments belong to a single event. However, the assumption is invalid in general as there may be multiple events. Therefore, we present a unified framework called MatEE for trigger-sharing events extraction. It resolves the kernel bottleneck by effectively modeling the relations between arguments by an event matrix, where trigger-sharing events are represented by multiple cliques. We verify the proposed method on 3 widely-used benchmark datasets of event extraction. The experimental results show that it beats all the advanced competitors, significantly improving the state-of-the-art performances in event extraction.


pdf bib
PairRE: Knowledge Graph Embeddings via Paired Relation Vectors
Linlin Chao | Jianshan He | Taifeng Wang | Wei Chu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Distance based knowledge graph embedding methods show promising results on link prediction task, on which two topics have been widely studied: one is the ability to handle complex relations, such as N-to-1, 1-to-N and N-to-N, the other is to encode various relation patterns, such as symmetry/antisymmetry. However, the existing methods fail to solve these two problems at the same time, which leads to unsatisfactory results. To mitigate this problem, we propose PairRE, a model with paired vectors for each relation representation. The paired vectors enable an adaptive adjustment of the margin in loss function to fit for different complex relations. Besides, PairRE is capable of encoding three important relation patterns, symmetry/antisymmetry, inverse and composition. Given simple constraints on relation representations, PairRE can encode subrelation further. Experiments on link prediction benchmarks demonstrate the proposed key capabilities of PairRE. Moreover, We set a new state-of-the-art on two knowledge graph datasets of the challenging Open Graph Benchmark.


pdf bib
Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
Weipeng Huang | Xingyi Cheng | Kunlong Chen | Taifeng Wang | Wei Chu
Proceedings of the 28th International Conference on Computational Linguistics

The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities. Multi-criteria Chinese word segmentation aims to capture various annotation criteria among datasets and leverage their common underlying knowledge. In this paper, we propose a domain adaptive segmenter to exploit diverse criteria of various datasets. Our model is based on Bidirectional Encoder Representations from Transformers (BERT), which is responsible for introducing open-domain knowledge. Private and shared projection layers are proposed to capture domain-specific knowledge and common knowledge, respectively. We also optimize computational efficiency via distillation, quantization, and compiler optimization. Experiments show that our segmenter outperforms the previous state of the art (SOTA) models on 10 CWS datasets with superior efficiency.

pdf bib
Generating Informative Conversational Response using Recurrent Knowledge-Interaction and Knowledge-Copy
Xiexiong Lin | Weiyu Jian | Jianshan He | Taifeng Wang | Wei Chu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Knowledge-driven conversation approaches have achieved remarkable research attention recently. However, generating an informative response with multiple relevant knowledge without losing fluency and coherence is still one of the main challenges. To address this issue, this paper proposes a method that uses recurrent knowledge interaction among response decoding steps to incorporate appropriate knowledge. Furthermore, we introduce a knowledge copy mechanism using a knowledge-aware pointer network to copy words from external knowledge according to knowledge attention distribution. Our joint neural conversation model which integrates recurrent Knowledge-Interaction and knowledge Copy (KIC) performs well on generating informative responses. Experiments demonstrate that our model with fewer parameters yields significant improvements over competitive baselines on two datasets Wizard-of-Wikipedia(average Bleu +87%; abs.: 0.034) and DuConv(average Bleu +20%; abs.: 0.047)) with different knowledge formats (textual & structured) and different languages (English & Chinese).

pdf bib
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check
Xingyi Cheng | Weidi Xu | Kunlong Chen | Shaohua Jiang | Feng Wang | Taifeng Wang | Wei Chu | Yuan Qi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Chinese Spelling Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. Existing methods have made attempts to incorporate the similarity knowledge between Chinese characters. However, they take the similarity knowledge as either an external input resource or just heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN is learned to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, enabling the whole network to be end-to-end trainable. Experiments are conducted on three human-annotated datasets. Our method achieves superior performance against previous models by a large margin.

pdf bib
Question Directed Graph Attention Network for Numerical Reasoning over Text
Kunlong Chen | Weidi Xu | Xingyi Cheng | Zou Xiaochuan | Yuyu Zhang | Le Song | Taifeng Wang | Yuan Qi | Wei Chu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph. Our model, which combines deep learning and graph reasoning, achieves remarkable results in benchmark datasets such as DROP.


pdf bib
Variational Semi-Supervised Aspect-Term Sentiment Analysis via Transformer
Xingyi Cheng | Weidi Xu | Taifeng Wang | Wei Chu | Weipeng Huang | Kunlong Chen | Junfeng Hu
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Aspect-term sentiment analysis (ATSA) is a long-standing challenge in natural language process. It requires fine-grained semantical reasoning about a target entity appeared in the text. As manual annotation over the aspects is laborious and time-consuming, the amount of labeled data is limited for supervised learning. This paper proposes a semi-supervised method for the ATSA problem by using the Variational Autoencoder based on Transformer. The model learns the latent distribution via variational inference. By disentangling the latent representation into the aspect-specific sentiment and the lexical context, our method induces the underlying sentiment prediction for the unlabeled data, which then benefits the ATSA classifier. Our method is classifier-agnostic, i.e., the classifier is an independent module and various supervised models can be integrated. Experimental results are obtained on the SemEval 2014 task 4 and show that our method is effective with different the five specific classifiers and outperforms these models by a significant margin.


pdf bib
AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine
Minghui Qiu | Feng-Lin Li | Siyu Wang | Xing Gao | Yan Chen | Weipeng Zhao | Haiqing Chen | Jun Huang | Wei Chu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models. AliMe Chat uses an attentive Seq2Seq based rerank model to optimize the joint results. Extensive experiments show our engine outperforms both IR and generation based models. We launch AliMe Chat for a real-world industrial application and observe better results than another public chatbot.