Xiaojun Quan


2021

pdf bib
Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory
YunHao Li | Yunyi Yang | Xiaojun Quan | Jianxing Yu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Bi-Granularity Contrastive Learning for Post-Training in Few-Shot Scene
Ruikun Luo | Guanhuan Huang | Xiaojun Quan
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Learning to Answer Psychological Questionnaire for Personality Detection
Feifan Yang | Tao Yang | Xiaojun Quan | Qinliang Su
Findings of the Association for Computational Linguistics: EMNLP 2021

Existing text-based personality detection research mostly relies on data-driven approaches to implicitly capture personality cues in online posts, lacking the guidance of psychological knowledge. Psychological questionnaire, which contains a series of dedicated questions highly related to personality traits, plays a critical role in self-report personality assessment. We argue that the posts created by a user contain critical contents that could help answer the questions in a questionnaire, resulting in an assessment of his personality by linking the texts and the questionnaire. To this end, we propose a new model named Psychological Questionnaire enhanced Network (PQ-Net) to guide personality detection by tracking critical information in texts with a questionnaire. Specifically, PQ-Net contains two streams: a context stream to encode each piece of text into a contextual text representation, and a questionnaire stream to capture relevant information in the contextual text representation to generate potential answer representations for a questionnaire. The potential answer representations are used to enhance the contextual text representation and to benefit personality prediction. Experimental results on two datasets demonstrate the superiority of PQ-Net in capturing useful cues from the posts for personality detection.

pdf bib
Directed Acyclic Graph Network for Conversational Emotion Recognition
Weizhou Shen | Siyue Wu | Yunyi Yang | Xiaojun Quan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The modeling of conversational context plays a vital role in emotion recognition from conversation (ERC). In this paper, we put forward a novel idea of encoding the utterances with a directed acyclic graph (DAG) to better model the intrinsic structure within a conversation, and design a directed acyclic neural network, namely DAG-ERC, to implement this idea. In an attempt to combine the strengths of conventional graph-based neural models and recurrence-based neural models, DAG-ERC provides a more intuitive way to model the information flow between long-distance conversation background and nearby context. Extensive experiments are conducted on four ERC benchmarks with state-of-the-art models employed as baselines for comparison. The empirical results demonstrate the superiority of this new model and confirm the motivation of the directed acyclic graph architecture for ERC.

pdf bib
Psycholinguistic Tripartite Graph Network for Personality Detection
Tao Yang | Feifan Yang | Haolan Ouyang | Xiaojun Quan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data-driven manner, without the exploitation of psycholinguistic knowledge that may unveil the connections between one’s language use and his psychological traits. In this paper, we propose a psycholinguistic knowledge-based tripartite graph network, TrigNet, which consists of a tripartite graph network and a BERT-based graph initializer. The graph network injects structural psycholinguistic knowledge in LIWC, a computerized instrument for psycholinguistic analysis, by constructing a heterogeneous tripartite graph. The initializer is employed to provide initial embeddings for the graph nodes. To reduce the computational cost in graph learning, we further propose a novel flow graph attention network (GAT) that only transmits messages between neighboring parties in the tripartite graph. Benefiting from the tripartite graph, TrigNet can aggregate post information from a psychological perspective, which is a novel way of exploiting domain knowledge. Extensive experiments on two datasets show that TrigNet outperforms the existing state-of-art model by 3.47 and 2.10 points in average F1. Moreover, the flow GAT reduces the FLOPS and Memory measures by 38% and 32%, respectively, in comparison to the original GAT in our setting.

pdf bib
Syntax-Enhanced Pre-trained Model
Zenan Xu | Daya Guo | Duyu Tang | Qinliang Su | Linjun Shou | Ming Gong | Wanjun Zhong | Xiaojun Quan | Daxin Jiang | Nan Duan
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We study the problem of leveraging the syntactic structure of text to enhance pre-trained models such as BERT and RoBERTa. Existing methods utilize syntax of text either in the pre-training stage or in the fine-tuning stage, so that they suffer from discrepancy between the two stages. Such a problem would lead to the necessity of having human-annotated syntactic information, which limits the application of existing methods to broader scenarios. To address this, we present a model that utilizes the syntax of text in both pre-training and fine-tuning stages. Our model is based on Transformer with a syntax-aware attention layer that considers the dependency tree of the text. We further introduce a new pre-training task of predicting the syntactic distance among tokens in the dependency tree. We evaluate the model on three downstream tasks, including relation classification, entity typing, and question answering. Results show that our model achieves state-of-the-art performance on six public benchmark datasets. We have two major findings. First, we demonstrate that infusing automatically produced syntax of text improves pre-trained models. Second, global syntactic distances among tokens bring larger performance gains compared to local head relations between contiguous tokens.

2020

pdf bib
Constituency Lattice Encoding for Aspect Term Extraction
Yunyi Yang | Kun Li | Xiaojun Quan | Weizhou Shen | Qinliang Su
Proceedings of the 28th International Conference on Computational Linguistics

One of the remaining challenges for aspect term extraction in sentiment analysis resides in the extraction of phrase-level aspect terms, which is non-trivial to determine the boundaries of such terms. In this paper, we aim to address this issue by incorporating the span annotations of constituents of a sentence to leverage the syntactic information in neural network models. To this end, we first construct a constituency lattice structure based on the constituents of a constituency tree. Then, we present two approaches to encoding the constituency lattice using BiLSTM-CRF and BERT as the base models, respectively. We experimented on two benchmark datasets to evaluate the two models, and the results confirm their superiority with respective 3.17 and 1.35 points gained in F1-Measure over the current state of the art. The improvements justify the effectiveness of the constituency lattice for aspect term extraction.

pdf bib
Multi-choice Relational Reasoning for Machine Reading Comprehension
Wuya Chen | Xiaojun Quan | Chunyu Kit | Zhengcheng Min | Jiahai Wang
Proceedings of the 28th International Conference on Computational Linguistics

This paper presents our study of cloze-style reading comprehension by imitating human reading comprehension, which normally involves tactical comparing and reasoning over candidates while choosing the best answer. We propose a multi-choice relational reasoning (McR2) model with an aim to enable relational reasoning on candidates based on fusion representations of document, query and candidates. For the fusion representations, we develop an efficient encoding architecture by integrating the schemes of bidirectional attention flow, self-attention and document-gated query reading. Then, comparing and inferring over candidates are executed by a novel relational reasoning network. We conduct extensive experiments on four datasets derived from two public corpora, Children’s Book Test and Who DiD What, to verify the validity and advantages of our model. The results show that it outperforms all baseline models significantly on the four benchmark datasets. The effectiveness of its key components is also validated by an ablation study.

pdf bib
Embedding Dynamic Attributed Networks by Modeling the Evolution Processes
Zenan Xu | Zijing Ou | Qinliang Su | Jianxing Yu | Xiaojun Quan | ZhenKun Lin
Proceedings of the 28th International Conference on Computational Linguistics

Network embedding has recently emerged as a promising technique to embed nodes of a network into low-dimensional vectors. While fairly successful, most existing works focus on the embedding techniques for static networks. But in practice, there are many networks that are evolving over time and hence are dynamic, e.g., the social networks. To address this issue, a high-order spatio-temporal embedding model is developed to track the evolutions of dynamic networks. Specifically, an activeness-aware neighborhood embedding method is first proposed to extract the high-order neighborhood information at each given timestamp. Then, an embedding prediction framework is further developed to capture the temporal correlations, in which the attention mechanism is employed instead of recurrent neural networks (RNNs) for its efficiency in computing and flexibility in modeling. Extensive experiments are conducted on four real-world datasets from three different areas. It is shown that the proposed method outperforms all the baselines by a substantial margin for the tasks of dynamic link prediction and node classification, which demonstrates the effectiveness of the proposed methods on tracking the evolutions of dynamic networks.

pdf bib
Relational Graph Attention Network for Aspect-based Sentiment Analysis
Kai Wang | Weizhou Shen | Yunyi Yang | Xiaojun Quan | Rui Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Aspect-based sentiment analysis aims to determine the sentiment polarity towards a specific aspect in online reviews. Most recent efforts adopt attention-based neural network models to implicitly connect aspects with opinion words. However, due to the complexity of language and the existence of multiple aspects in a single sentence, these models often confuse the connections. In this paper, we address this problem by means of effective encoding of syntax information. Firstly, we define a unified aspect-oriented dependency tree structure rooted at a target aspect by reshaping and pruning an ordinary dependency parse tree. Then, we propose a relational graph attention network (R-GAT) to encode the new tree structure for sentiment prediction. Extensive experiments are conducted on the SemEval 2014 and Twitter datasets, and the experimental results confirm that the connections between aspects and opinion words can be better established with our approach, and the performance of the graph attention network (GAT) is significantly improved as a consequence.

pdf bib
Low-Resource Generation of Multi-hop Reasoning Questions
Jianxing Yu | Wei Liu | Shuang Qiu | Qinliang Su | Kai Wang | Xiaojun Quan | Jian Yin
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper focuses on generating multi-hop reasoning questions from the raw text in a low resource circumstance. Such questions have to be syntactically valid and need to logically correlate with the answers by deducing over multiple relations on several sentences in the text. Specifically, we first build a multi-hop generation model and guide it to satisfy the logical rationality by the reasoning chain extracted from a given text. Since the labeled data is limited and insufficient for training, we propose to learn the model with the help of a large scale of unlabeled data that is much easier to obtain. Such data contains rich expressive forms of the questions with structural patterns on syntax and semantics. These patterns can be estimated by the neural hidden semi-Markov model using latent variables. With latent patterns as a prior, we can regularize the generation model and produce the optimal results. Experimental results on the HotpotQA data set demonstrate the effectiveness of our model. Moreover, we apply the generated results to the task of machine reading comprehension and achieve significant performance improvements.

pdf bib
Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation
Kun Li | Chengbo Chen | Xiaojun Quan | Qing Ling | Yan Song
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Aspect term extraction aims to extract aspect terms from review texts as opinion targets for sentiment analysis. One of the big challenges with this task is the lack of sufficient annotated data. While data augmentation is potentially an effective technique to address the above issue, it is uncontrollable as it may change aspect words and aspect labels unexpectedly. In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels. We propose a masked sequence-to-sequence method for conditional augmentation of aspect term extraction. Unlike existing augmentation approaches, ours is controllable and allows to generate more diversified sentences. Experimental results confirm that our method alleviates the data scarcity problem significantly. It also effectively boosts the performances of several current models for aspect term extraction.

pdf bib
Multi-Domain Dialogue Acts and Response Co-Generation
Kai Wang | Junfeng Tian | Rui Wang | Xiaojun Quan | Jianxing Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generating fluent and informative responses is of critical importance for task-oriented dialogue systems. Existing pipeline approaches generally predict multiple dialogue acts first and use them to assist response generation. There are at least two shortcomings with such approaches. First, the inherent structures of multi-domain dialogue acts are neglected. Second, the semantic associations between acts and responses are not taken into account for response generation. To address these issues, we propose a neural co-generation model that generates dialogue acts and responses concurrently. Unlike those pipeline approaches, our act generation module preserves the semantic structures of multi-domain dialogue acts and our response generation module dynamically attends to different acts as needed. We train the two modules jointly using an uncertainty loss to adjust their task weights adaptively. Extensive experiments are conducted on the large-scale MultiWOZ dataset and the results show that our model achieves very favorable improvement over several state-of-the-art models in both automatic and human evaluations.

pdf bib
Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge
Yuanhe Tian | Yan Song | Xiang Ao | Fei Xia | Xiaojun Quan | Tong Zhang | Yonggang Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Chinese word segmentation (CWS) and part-of-speech (POS) tagging are important fundamental tasks for Chinese language processing, where joint learning of them is an effective one-step solution for both tasks. Previous studies for joint CWS and POS tagging mainly follow the character-based tagging paradigm with introducing contextual information such as n-gram features or sentential representations from recurrent neural models. However, for many cases, the joint tagging needs not only modeling from context features but also knowledge attached to them (e.g., syntactic relations among words); limited efforts have been made by existing research to meet such needs. In this paper, we propose a neural model named TwASP for joint CWS and POS tagging following the character-based sequence labeling paradigm, where a two-way attention mechanism is used to incorporate both context feature and their corresponding syntactic knowledge for each input character. Particularly, we use existing language processing toolkits to obtain the auto-analyzed syntactic knowledge for the context, and the proposed attention module can learn and benefit from them although their quality may not be perfect. Our experiments illustrate the effectiveness of the two-way attentions for joint CWS and POS tagging, where state-of-the-art performance is achieved on five benchmark datasets.

2019

pdf bib
A Deep Neural Information Fusion Architecture for Textual Network Embeddings
Zenan Xu | Qinliang Su | Xiaojun Quan | Weijia Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Textual network embeddings aim to learn a low-dimensional representation for every node in the network so that both the structural and textual information from the networks can be well preserved in the representations. Traditionally, the structural and textual embeddings were learned by models that rarely take the mutual influences between them into account. In this paper, a deep neural architecture is proposed to effectively fuse the two kinds of informations into one representation. The novelties of the proposed architecture are manifested in the aspects of a newly defined objective function, the complementary information fusion method for structural and textual features, and the mutual gate mechanism for textual feature extraction. Experimental results show that the proposed model outperforms the comparing methods on all three datasets.

pdf bib
BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization
Kai Wang | Xiaojun Quan | Rui Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The success of neural summarization models stems from the meticulous encodings of source articles. To overcome the impediments of limited and sometimes noisy training data, one promising direction is to make better use of the available training data by applying filters during summarization. In this paper, we propose a novel Bi-directional Selective Encoding with Template (BiSET) model, which leverages template discovered from training data to softly select key information from each source article to guide its summarization process. Extensive experiments on a standard summarization dataset are conducted and the results show that the template-equipped BiSET model manages to improve the summarization performance significantly with a new state of the art.

2013

pdf bib
Non-Monotonic Sentence Alignment via Semisupervised Learning
Xiaojun Quan | Chunyu Kit | Yan Song
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)