Qingyu Yin


pdf bib
CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data
Rui Feng | Chen Luo | Qingyu Yin | Bing Yin | Tuo Zhao | Chao Zhang
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

User sessions empower many search and recommendation tasks on a daily basis. Such session data are semi-structured, which encode heterogeneous relations between queries and products, and each item is described by the unstructured text. Despite recent advances in self-supervised learning for text or graphs, there lack of self-supervised learning models that can effectively capture both intra-item semantics and inter-item interactions for semi-structured sessions. To fill this gap, we propose CERES, a graph-based transformer model for semi-structured session data. CERES learns representations that capture both inter- and intra-item semantics with (1) a graph-conditioned masked language pretraining task that jointly learns from item text and item-item relations; and (2) a graph-conditioned transformer architecture that propagates inter-item contexts to item-level representations. We pretrained CERES using ~468 million Amazon sessions and find that CERES outperforms strong pretraining baselines by up to 9% in three session search and entity linking tasks.

pdf bib
SEQZERO: Few-shot Compositional Semantic Parsing with Sequential Prompts and Zero-shot Models
Jingfeng Yang | Haoming Jiang | Qingyu Yin | Danqing Zhang | Bing Yin | Diyi Yang
Findings of the Association for Computational Linguistics: NAACL 2022

Recent research showed promising results on combining pretrained language models (LMs) with canonical utterance for few-shot semantic parsing.The canonical utterance is often lengthy and complex due to the compositional structure of formal languages. Learning to generate such canonical utterance requires significant amount of data to reach high performance. Fine-tuning with only few-shot samples, the LMs can easily forget pretrained knowledge, overfit spurious biases, and suffer from compositionally out-of-distribution generalization errors. To tackle these issues, we propose a novel few-shot semantic parsing method – SEQZERO. SEQZERO decomposes the problem into a sequence of sub-problems, which corresponds to the sub-clauses of the formal language. Based on the decomposition, the LMs only need to generate short answers using prompts for predicting sub-clauses. Thus, SEQZERO avoids generating a long canonical utterance at once. Moreover, SEQZERO employs not only a few-shot model but also a zero-shot model to alleviate the overfitting.In particular, SEQZERO brings out the merits from both models via ensemble equipped with our proposed constrained rescaling.SEQZERO achieves SOTA performance of BART-based models on GeoQuery and EcommerceQuery, which are two few-shot datasets with compositional data split.

pdf bib
Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training
Yifan Gao | Qingyu Yin | Zheng Li | Rui Meng | Tong Zhao | Bing Yin | Irwin King | Michael Lyu
Findings of the Association for Computational Linguistics: NAACL 2022

Keyphrase generation is the task of automatically predicting keyphrases given a piece of long text. Despite its recent flourishing, keyphrase generation on non-English languages haven’t been vastly investigated. In this paper, we call attention to a new setting named multilingual keyphrase generation and we contribute two new datasets, EcommerceMKP and AcademicMKP, covering six languages. Technically, we propose a retrieval-augmented method for multilingual keyphrase generation to mitigate the data shortage problem in non-English languages. The retrieval-augmented model leverages keyphrase annotations in English datasets to facilitate generating keyphrases in low-resource languages. Given a non-English passage, a cross-lingual dense passage retrieval module finds relevant English passages. Then the associated English keyphrases serve as external knowledge for keyphrase generation in the current language. Moreover, we develop a retriever-generator iterative training algorithm to mine pseudo parallel passage pairs to strengthen the cross-lingual passage retriever. Comprehensive experiments and ablations show that the proposed approach outperforms all baselines.

pdf bib
All Information is Valuable: Question Matching over Full Information Transmission Network
Le Qi | Yu Zhang | Qingyu Yin | Guidong Zheng | Wen Junjie | Jinlong Li | Ting Liu
Findings of the Association for Computational Linguistics: NAACL 2022

Question matching is the task of identifying whether two questions have the same intent. For better reasoning the relationship between questions, existing studies adopt multiple interaction modules and perform multi-round reasoning via deep neural networks. In this process, there are two kinds of critical information that are commonly employed: the representation information of original questions and the interactive information between pairs of questions. However, previous studies tend to transmit only one kind of information, while failing to utilize both kinds of information simultaneously. To address this problem, in this paper, we propose a Full Information Transmission Network (FITN) that can transmit both representation and interactive information together in a simultaneous fashion. More specifically, we employ a novel memory-based attention for keeping and transmitting the interactive information through a global interaction matrix. Besides, we apply an original-average mixed connection method to effectively transmit the representation information between different reasoning rounds, which helps to preserve the original representation features of questions along with the historical hidden features. Experiments on two standard benchmarks demonstrate that our approach outperforms strong baseline models.


pdf bib
Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification
Qi Shi | Yu Zhang | Qingyu Yin | Ting Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Table-based fact verification task aims to verify whether the given statement is supported by the given semi-structured table. Symbolic reasoning with logical operations plays a crucial role in this task. Existing methods leverage programs that contain rich logical information to enhance the verification process. However, due to the lack of fully supervised signals in the program generation process, spurious programs can be derived and employed, which leads to the inability of the model to catch helpful logical operations. To address the aforementioned problems, in this work, we formulate the table-based fact verification task as an evidence retrieval and reasoning framework, proposing the Logic-level Evidence Retrieval and Graph-based Verification network (LERGV). Specifically, we first retrieve logic-level program-like evidence from the given table and statement as supplementary evidence for the table. After that, we construct a logic-level graph to capture the logical relations between entities and functions in the retrieved evidence, and design a graph-based verification network to perform logic-level graph-based reasoning based on the constructed graph to classify the final entailment relation. Experimental results on the large-scale benchmark TABFACT show the effectiveness of the proposed approach.


pdf bib
Learn to Combine Linguistic and Symbolic Information for Table-based Fact Verification
Qi Shi | Yu Zhang | Qingyu Yin | Ting Liu
Proceedings of the 28th International Conference on Computational Linguistics

Table-based fact verification is expected to perform both linguistic reasoning and symbolic reasoning. Existing methods lack attention to take advantage of the combination of linguistic information and symbolic information. In this work, we propose HeterTFV, a graph-based reasoning approach, that learns to combine linguistic information and symbolic information effectively. We first construct a program graph to encode programs, a kind of LISP-like logical form, to learn the semantic compositionality of the programs. Then we construct a heterogeneous graph to incorporate both linguistic information and symbolic information by introducing program nodes into the heterogeneous graph. Finally, we propose a graph-based reasoning approach to reason over the multiple types of nodes to make an effective combination of both types of information. Experimental results on a large-scale benchmark dataset TABFACT illustrate the effect of our approach.


pdf bib
Towards Explainable NLP: A Generative Explanation Framework for Text Classification
Hui Liu | Qingyu Yin | William Yang Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning systems tend to focus on interpreting the outputs or the connections between inputs and outputs. However, the fine-grained information (e.g. textual explanations for the labels) is often ignored, and the systems do not explicitly generate the human-readable explanations. To solve this problem, we propose a novel generative explanation framework that learns to make classification decisions and generate fine-grained explanations at the same time. More specifically, we introduce the explainable factor and the minimum risk training approach that learn to generate more reasonable explanations. We construct two new datasets that contain summaries, rating scores, and fine-grained reasons. We conduct experiments on both datasets, comparing with several strong neural network baseline systems. Experimental results show that our method surpasses all baselines on both datasets, and is able to generate concise explanations at the same time.


pdf bib
Zero Pronoun Resolution with Attention-based Neural Network
Qingyu Yin | Yu Zhang | Weinan Zhang | Ting Liu | William Yang Wang
Proceedings of the 27th International Conference on Computational Linguistics

Recent neural network methods for zero pronoun resolution explore multiple models for generating representation vectors for zero pronouns and their candidate antecedents. Typically, contextual information is utilized to encode the zero pronouns since they are simply gaps that contain no actual content. To better utilize contexts of the zero pronouns, we here introduce the self-attention mechanism for encoding zero pronouns. With the help of the multiple hops of attention, our model is able to focus on some informative parts of the associated texts and therefore produces an efficient way of encoding the zero pronouns. In addition, an attention-based recurrent neural network is proposed for encoding candidate antecedents by their contents. Experiment results are encouraging: our proposed attention-based model gains the best performance on the Chinese portion of the OntoNotes corpus, substantially surpasses existing Chinese zero pronoun resolution baseline systems.

pdf bib
Deep Reinforcement Learning for Chinese Zero Pronoun Resolution
Qingyu Yin | Yu Zhang | Wei-Nan Zhang | Ting Liu | William Yang Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent neural network models for Chinese zero pronoun resolution gain great performance by capturing semantic information for zero pronouns and candidate antecedents, but tend to be short-sighted, operating solely by making local decisions. They typically predict coreference links between the zero pronoun and one single candidate antecedent at a time while ignoring their influence on future decisions. Ideally, modeling useful information of preceding potential antecedents is crucial for classifying later zero pronoun-candidate antecedent pairs, a need which leads traditional models of zero pronoun resolution to draw on reinforcement learning. In this paper, we show how to integrate these goals, applying deep reinforcement learning to deal with the task. With the help of the reinforcement learning agent, our system learns the policy of selecting antecedents in a sequential manner, where useful information provided by earlier predicted antecedents could be utilized for making later coreference decisions. Experimental results on OntoNotes 5.0 show that our approach substantially outperforms the state-of-the-art methods under three experimental settings.


pdf bib
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
Ting Liu | Yiming Cui | Qingyu Yin | Wei-Nan Zhang | Shijin Wang | Guoping Hu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most existing approaches for zero pronoun resolution are heavily relying on annotated data, which is often released by shared task organizers. Therefore, the lack of annotated data becomes a major obstacle in the progress of zero pronoun resolution task. Also, it is expensive to spend manpower on labeling the data for better performance. To alleviate the problem above, in this paper, we propose a simple but novel approach to automatically generate large-scale pseudo training data for zero pronoun resolution. Furthermore, we successfully transfer the cloze-style reading comprehension neural network model into zero pronoun resolution task and propose a two-step training mechanism to overcome the gap between the pseudo training data and the real one. Experimental results show that the proposed approach significantly outperforms the state-of-the-art systems with an absolute improvements of 3.1% F-score on OntoNotes 5.0 data.

pdf bib
Chinese Zero Pronoun Resolution with Deep Memory Network
Qingyu Yin | Yu Zhang | Weinan Zhang | Ting Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Existing approaches for Chinese zero pronoun resolution typically utilize only syntactical and lexical features while ignoring semantic information. The fundamental reason is that zero pronouns have no descriptive information, which brings difficulty in explicitly capturing their semantic similarities with antecedents. Meanwhile, representing zero pronouns is challenging since they are merely gaps that convey no actual content. In this paper, we address this issue by building a deep memory network that is capable of encoding zero pronouns into vector representations with information obtained from their contexts and potential antecedents. Consequently, our resolver takes advantage of semantic information by using these continuous distributed representations. Experiments on the OntoNotes 5.0 dataset show that the proposed memory network could substantially outperform the state-of-the-art systems in various experimental settings.