Kai Wang


Instance-Guided Prompt Learning for Few-Shot Text Matching
Jia Du | Xuanyu Zhang | Siyi Wang | Kai Wang | Yanquan Zhou | Lei Li | Qing Yang | Dongliang Xu
Findings of the Association for Computational Linguistics: EMNLP 2022

Few-shot text matching is a practical natural language processing (NLP) technique for determining whether two texts are semantically identical. Existing methods primarily design patterns to reformulate text matching as a pre-training task, with uniform prompts across all instances, but they fail to take into account the connection between prompts and instances. This paper argues that dynamically strengthening the correlation between particular instances and the prompts is necessary because fixed prompts cannot adequately fit all diverse instances at inference. We propose IGATE: Instance-Guided prompt leArning for few-shoT tExt matching, a novel pluggable prompt learning method. The gate mechanism used by IGATE, which sits between the embedding and the PLM encoders, uses the semantics of instances to regulate the effects of the gate on the prompt tokens. Experimental results show that IGATE achieves SOTA performance on MRPC and QQP, outperforming strong baselines. The code will be released on GitHub.


Hyperbolic Geometry is Not Necessary: Lightweight Euclidean-Based Models for Low-Dimensional Knowledge Graph Embeddings
Kai Wang | Yu Liu | Dan Lin | Michael Sheng
Findings of the Association for Computational Linguistics: EMNLP 2021

Recent knowledge graph embedding (KGE) models based on hyperbolic geometry have shown great potential in a low-dimensional embedding space. However, the necessity of hyperbolic space in KGE is still questionable, because calculations based on hyperbolic geometry are much more complicated than Euclidean operations. In this paper, based on the state-of-the-art hyperbolic-based model RotH, we develop two lightweight Euclidean-based models, called RotL and Rot2L. The RotL model simplifies the hyperbolic operations while keeping the flexible normalization effect. Building on RotL with a novel two-layer stacked transformation, the Rot2L model obtains improved representation capability, yet requires fewer parameters and calculations than RotH. Experiments on link prediction show that Rot2L achieves state-of-the-art performance on two widely used datasets in low-dimensional knowledge graph embeddings. Furthermore, RotL achieves performance similar to RotH while requiring only half the training time.


Relational Graph Attention Network for Aspect-based Sentiment Analysis
Kai Wang | Weizhou Shen | Yunyi Yang | Xiaojun Quan | Rui Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Aspect-based sentiment analysis aims to determine the sentiment polarity towards a specific aspect in online reviews. Most recent efforts adopt attention-based neural network models to implicitly connect aspects with opinion words. However, due to the complexity of language and the existence of multiple aspects in a single sentence, these models often confuse the connections. In this paper, we address this problem by means of effective encoding of syntax information. Firstly, we define a unified aspect-oriented dependency tree structure rooted at a target aspect by reshaping and pruning an ordinary dependency parse tree. Then, we propose a relational graph attention network (R-GAT) to encode the new tree structure for sentiment prediction. Extensive experiments are conducted on the SemEval 2014 and Twitter datasets, and the experimental results confirm that the connections between aspects and opinion words can be better established with our approach, and the performance of the graph attention network (GAT) is significantly improved as a consequence.

Low-Resource Generation of Multi-hop Reasoning Questions
Jianxing Yu | Wei Liu | Shuang Qiu | Qinliang Su | Kai Wang | Xiaojun Quan | Jian Yin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper focuses on generating multi-hop reasoning questions from raw text in a low-resource setting. Such questions have to be syntactically valid and need to logically correlate with the answers by deducing over multiple relations across several sentences in the text. Specifically, we first build a multi-hop generation model and guide it to satisfy logical rationality using the reasoning chain extracted from a given text. Since the labeled data is limited and insufficient for training, we propose to learn the model with the help of large-scale unlabeled data, which is much easier to obtain. Such data contains rich expressive forms of questions, with structural patterns in syntax and semantics. These patterns can be estimated by a neural hidden semi-Markov model using latent variables. With the latent patterns as a prior, we can regularize the generation model and produce the optimal results. Experimental results on the HotpotQA data set demonstrate the effectiveness of our model. Moreover, we apply the generated results to the task of machine reading comprehension and achieve significant performance improvements.

Multi-Domain Dialogue Acts and Response Co-Generation
Kai Wang | Junfeng Tian | Rui Wang | Xiaojun Quan | Jianxing Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generating fluent and informative responses is of critical importance for task-oriented dialogue systems. Existing pipeline approaches generally predict multiple dialogue acts first and use them to assist response generation. There are at least two shortcomings with such approaches. First, the inherent structures of multi-domain dialogue acts are neglected. Second, the semantic associations between acts and responses are not taken into account for response generation. To address these issues, we propose a neural co-generation model that generates dialogue acts and responses concurrently. Unlike those pipeline approaches, our act generation module preserves the semantic structures of multi-domain dialogue acts and our response generation module dynamically attends to different acts as needed. We train the two modules jointly using an uncertainty loss to adjust their task weights adaptively. Extensive experiments are conducted on the large-scale MultiWOZ dataset and the results show that our model achieves very favorable improvement over several state-of-the-art models in both automatic and human evaluations.

基于图神经网络的汉语依存分析和语义组合计算联合模型(Joint Learning Chinese Dependency Parsing and Semantic Composition based on Graph Neural Network)
Kai Wang (汪凯) | Mingtong Liu (刘明童) | Yuanmeng Chen (陈圆梦) | Yujie Zhang (张玉洁) | Jinan Xu (徐金安) | Yufeng Chen (陈钰枫)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

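The principle of compositionality states that the meaning of a sentence is composed from the meanings of its constituents according to certain rules. Semantic composition based on syntactic structure has therefore long been an important research direction, and tree-structured composition methods are the most representative of it. However, such methods are difficult to apply to large-scale data processing, mainly because the order of semantic composition depends on the structure of each specific tree, which prevents parallel processing. This paper proposes a joint framework for graph-based dependency parsing and semantic composition, and trains the semantic composition model and the parsing model with the help of a paraphrase identification task. On the one hand, the graph model can be processed in parallel during both training and prediction, which greatly shortens computation time. On the other hand, the semantic composition framework with joint parsing does not need to rely on an external parser, and joint learning of the two tasks lets the semantic representation capture both syntactic structure and semantic context. We evaluate on the public Chinese paraphrase identification dataset LCQMC. Experimental results show that accuracy reaches 79.54%, close to that of tree-structured composition methods, while prediction is up to 30 times faster.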


BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization
Kai Wang | Xiaojun Quan | Rui Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The success of neural summarization models stems from the meticulous encoding of source articles. To overcome the impediments of limited and sometimes noisy training data, one promising direction is to make better use of the available training data by applying filters during summarization. In this paper, we propose a novel Bi-directional Selective Encoding with Template (BiSET) model, which leverages templates discovered from training data to softly select key information from each source article to guide its summarization process. Extensive experiments on a standard summarization dataset show that the template-equipped BiSET model improves summarization performance significantly and sets a new state of the art.


LIUM-CVC Submissions for WMT18 Multimodal Translation Task
Ozan Caglayan | Adrien Bardet | Fethi Bougares | Loïc Barrault | Kai Wang | Marc Masana | Luis Herranz | Joost van de Weijer
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the multimodal Neural Machine Translation systems developed by LIUM and CVC for WMT18 Shared Task on Multimodal Translation. This year we propose several modifications to our previous multimodal attention architecture in order to better integrate convolutional features and refine them using encoder-side information. Our final constrained submissions ranked first for English→French and second for English→German language pairs among the constrained submissions according to the automatic evaluation metric METEOR.


Domain-Assisted Product Aspect Hierarchy Generation: Towards Hierarchical Organization of Unstructured Consumer Reviews
Jianxing Yu | Zheng-Jun Zha | Meng Wang | Kai Wang | Tat-Seng Chua
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing


Exploiting Salient Patterns for Question Detection and Question Retrieval in Community-based Question Answering
Kai Wang | Tat-Seng Chua
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)