Sujian Li


2021

pdf bib
Do It Once: An Embarrassingly Simple Joint Matching Approach to Response Selection
Linhao Zhang | Dehong Ma | Sujian Li | Houfeng Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Cross-Lingual Leveled Reading Based on Language-Invariant Features
Simin Rao | Hua Zheng | Sujian Li
Findings of the Association for Computational Linguistics: EMNLP 2021

Leveled reading (LR) aims to automatically classify texts by the cognitive levels of readers, which is fundamental in providing appropriate reading materials regarding different reading capabilities. However, most state-of-the-art LR methods rely on the availability of copious annotated resources, which prevents their adaptation to low-resource languages like Chinese. In our work, to tackle LR in Chinese, we explore how different language transfer methods perform on English-Chinese LR. Specifically, we focus on adversarial training and cross-lingual pre-training method to transfer the LR knowledge learned from annotated data in the resource-rich English language to Chinese. For evaluation, we first introduce the age-based standard to align datasets with different leveling standards. Then we conduct experiments in both zero-shot and few-shot settings. Comparing these two methods, quantitative and qualitative evaluations show that the cross-lingual pre-training method effectively captures the language-invariant features between English and Chinese. We conduct analysis to propose further improvement in cross-lingual LR.

pdf bib
Semi-Automatic Construction of Text-to-SQL Data for Domain Transfer
Tianyi Li | Sujian Li | Mark Steedman
Proceedings of the 17th International Conference on Parsing Technologies and the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies (IWPT 2021)

Strong and affordable in-domain data is a desirable asset when transferring trained semantic parsers to novel domains. As previous methods for semi-automatically constructing such data cannot handle the complexity of realistic SQL queries, we propose to construct SQL queries via context-dependent sampling, and introduce the concept of topic. Along with our SQL query construction method, we propose a novel pipeline of semi-automatic Text-to-SQL dataset construction that covers the broad space of SQL queries. We show that the created dataset is comparable with expert annotation along multiple dimensions, and is capable of improving domain transfer performance for SOTA semantic parsers.

pdf bib
阅读分级相关研究综述(A Survey of Leveled Reading)
Simin Rao (饶思敏) | Hua Zheng (郑婳) | Sujian Li (李素建)
Proceedings of the 20th Chinese National Conference on Computational Linguistics

“阅读分级的概念在二十世纪早期就被教育工作者提出,随着人们对阅读变得越来越重视,阅读分级引起了越来越多的关注,自动阅读分级技术也得到了一定程度的发展。本文总结了近年来的阅读分级领域的研究进展,首先介绍了阅读分级现有的标准和随之而产生的各种体系和语料资源。在此基础之上整理了在自动阅读分级工作已经广泛应用的三类方法:公式法、传统的机器学习方法和最近热门的深度学习方法,并结合实验结果梳理了三类方法存在的弊利,以及可以改进的方向。最后本文还对阅读分级的未来发展方向以及可以应用的领域进行了总结和展望。”

pdf bib
Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting
Yi Cheng | Siyao Li | Bang Liu | Ruihui Zhao | Sujian Li | Chenghua Lin | Yefeng Zheng
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper explores the task of Difficulty-Controllable Question Generation (DCQG), which aims at generating questions with required difficulty levels. Previous research on this task mainly defines the difficulty of a question as whether it can be correctly answered by a Question Answering (QA) system, lacking interpretability and controllability. In our work, we redefine question difficulty as the number of inference steps required to answer it and argue that Question Generation (QG) systems should have stronger control over the logic of generated questions. To this end, we propose a novel framework that progressively increases question difficulty through step-by-step rewriting under the guidance of an extracted reasoning chain. A dataset is automatically constructed to facilitate the research, on which extensive experiments are conducted to test the performance of our method.

pdf bib
BASS: Boosting Abstractive Summarization with Unified Semantic Graph
Wenhao Wu | Wei Li | Xinyan Xiao | Jiachen Liu | Ziqiang Cao | Sujian Li | Hua Wu | Haifeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Abstractive summarization for long-document or multi-document remains challenging for the Seq2Seq architecture, as Seq2Seq is not good at analyzing long-distance relations in text. In this paper, we present BASS, a novel framework for Boosting Abstractive Summarization based on a unified Semantic graph, which aggregates co-referent phrases distributing across a long range of context and conveys rich relations between phrases. Further, a graph-based encoder-decoder model is proposed to improve both the document representation and summary generation process by leveraging the graph structure. Specifically, several graph augmentation methods are designed to encode both the explicit and implicit relations in the text while the graph-propagation attention mechanism is developed in the decoder to select salient content into the summary. Empirical results show that the proposed architecture brings substantial improvements for both long-document and multi-document summarization tasks.

2020

pdf bib
Composing Elementary Discourse Units in Abstractive Summarization
Zhenwen Li | Wenhao Wu | Sujian Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In this paper, we argue that elementary discourse unit (EDU) is a more appropriate textual unit of content selection than the sentence unit in abstractive summarization. To well handle the problem of composing EDUs into an informative and fluent summary, we propose a novel summarization method that first designs an EDU selection model to extract and group informative EDUs and then an EDU fusion model to fuse the EDUs in each group into one sentence. We also design the reinforcement learning mechanism to use EDU fusion results to reward the EDU selection action, boosting the final summarization performance. Experiments on CNN/Daily Mail have demonstrated the effectiveness of our model.

pdf bib
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Maosong Sun (孙茂松) | Sujian Li (李素建) | Yue Zhang (张岳) | Yang Liu (刘洋)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

pdf bib
Refining Data for Text Generation
Wenyu Guan | Qianying Liu | Tianyi Li | Sujian Li
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Recent work on data-to-text generation has made progress under the neural encoder-decoder architectures. However, the data input size is often enormous, while not all data records are important for text generation and inappropriate input may bring noise into the final output. To solve this problem, we propose a two-step approach which first selects and orders the important data records and then generates text from the noise-reduced data. Here we propose a learning to rank model to rank the importance of each record which is supervised by a relation extractor. With the noise-reduced data as input, we implement a text generator which sequentially models the input data records and emits a summary. Experiments on the ROTOWIRE dataset verifies the effectiveness of our proposed method in both performance and efficiency.

pdf bib
Syntax-Aware Graph Attention Network for Aspect-Level Sentiment Classification
Lianzhe Huang | Xin Sun | Sujian Li | Linhao Zhang | Houfeng Wang
Proceedings of the 28th International Conference on Computational Linguistics

Aspect-level sentiment classification aims to distinguish the sentiment polarities over aspect terms in a sentence. Existing approaches mostly focus on modeling the relationship between the given aspect words and their contexts with attention, and ignore the use of more elaborate knowledge implicit in the context. In this paper, we exploit syntactic awareness to the model by the graph attention network on the dependency tree structure and external pre-training knowledge by BERT language model, which helps to model the interaction between the context and aspect words better. And the subwords of BERT are integrated into the dependency tree graphs, which can obtain more accurate representations of words by graph attention. Experiments demonstrate the effectiveness of our model.

pdf bib
Research on Discourse Parsing: from the Dependency View
Sujian Li
Proceedings of the Second International Workshop of Discourse Processing

Discourse parsing aims to comprehensively acquire the logical structure of the whole text which may be helpful to some downstream applications such as summarization, reading comprehension, QA and so on. One important issue behind discourse parsing is the representation of discourse structure. Up to now, many discourse structures have been proposed, and the correponding parsing methods are designed, promoting the development of discourse research. In this paper, we mainly introduce our recent discourse research and its preliminary application from the dependency view.

pdf bib
Evaluating Text Coherence at Sentence and Paragraph Levels
Sennan Liu | Shuang Zeng | Sujian Li
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, to evaluate text coherence, we propose the paragraph ordering task as well as conducting sentence ordering. We collected four distinct corpora from different domains on which we investigate the adaptation of existing sentence ordering methods to a paragraph ordering task. We also compare the learnability and robustness of existing models by artificially creating mini datasets and noisy datasets respectively and verifying the efficiency of established models under these circumstances. Furthermore, we carry out human evaluation on the rearranged passages from two competitive models and confirm that WLCS-l is a better metric performing significantly higher correlations with human rating than τ , the most prevalent metric used before. Results from these evaluations show that except for certain extreme conditions, the recurrent graph neural network-based model is an optimal choice for coherence modeling.

2019

pdf bib
Tree-structured Decoding for Solving Math Word Problems
Qianying Liu | Wenyv Guan | Sujian Li | Daisuke Kawahara
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Automatically solving math word problems is an interesting research topic that needs to bridge natural language descriptions and formal math equations. Previous studies introduced end-to-end neural network methods, but these approaches did not efficiently consider an important characteristic of the equation, i.e., an abstract syntax tree. To address this problem, we propose a tree-structured decoding method that generates the abstract syntax tree of the equation in a top-down manner. In addition, our approach can automatically stop during decoding without a redundant stop token. The experimental results show that our method achieves single model state-of-the-art performance on Math23K, which is the largest dataset on this task.

pdf bib
Text Level Graph Neural Network for Text Classification
Lianzhe Huang | Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recently, researches have explored the graph neural network (GNN) techniques on text classification, since GNN does well in handling complex structures and preserving global information. However, previous methods based on GNN are mainly faced with the practical problems of fixed corpus level graph structure which don’t support online testing and high memory consumption. To tackle the problems, we propose a new GNN based model that builds graphs for each input text with global parameters sharing instead of a single graph for the whole corpus. This method removes the burden of dependence between an individual text and entire corpus which support online testing, but still preserve global information. Besides, we build graphs by much smaller windows in the text, which not only extract more local features but also significantly reduce the edge numbers as well as memory consumption. Experiments show that our model outperforms existing models on several text classification datasets even with consuming less memory.

pdf bib
Denoising based Sequence-to-Sequence Pre-training for Text Generation
Liang Wang | Wei Zhao | Ruoyu Jia | Sujian Li | Jingming Liu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper presents a new sequence-to-sequence (seq2seq) pre-training method PoDA (Pre-training of Denoising Autoencoders), which learns representations suitable for text generation tasks. Unlike encoder-only (e.g., BERT) or decoder-only (e.g., OpenAI GPT) pre-training approaches, PoDA jointly pre-trains both the encoder and decoder by denoising the noise-corrupted text, and it also has the advantage of keeping the network architecture unchanged in the subsequent fine-tuning stage. Meanwhile, we design a hybrid model of Transformer and pointer-generator networks as the backbone architecture for PoDA. We conduct experiments on two text generation tasks: abstractive summarization, and grammatical error correction. Results on four datasets show that PoDA can improve model performance over strong baselines without using any task-specific techniques and significantly speed up convergence.

pdf bib
Do NLP Models Know Numbers? Probing Numeracy in Embeddings
Eric Wallace | Yizhong Wang | Sujian Li | Sameer Singh | Matt Gardner
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The ability to understand and work with numbers (numeracy) is critical for many complex reasoning tasks. Currently, most NLP models treat numbers in text in the same way as other tokens—they embed them as distributed vectors. Is this enough to capture numeracy? We begin by investigating the numerical reasoning capabilities of a state-of-the-art question answering model on the DROP dataset. We find this model excels on questions that require numerical reasoning, i.e., it already captures numeracy. To understand how this capability emerges, we probe token embedding methods (e.g., BERT, GloVe) on synthetic list maximum, number decoding, and addition tasks. A surprising degree of numeracy is naturally present in standard embeddings. For example, GloVe and word2vec accurately encode magnitude for numbers up to 1,000. Furthermore, character-level embeddings are even more precise—ELMo captures numeracy the best for all pre-trained methods—but BERT, which uses sub-word units, is less exact.

pdf bib
An Improved Coarse-to-Fine Method for Solving Generation Tasks
Wenyv Guan | Qianying Liu | Guangzhi Han | Bin Wang | Sujian Li
Proceedings of the The 17th Annual Workshop of the Australasian Language Technology Association

The coarse-to-fine (coarse2fine) methods have recently been widely used in the generation tasks. The methods first generate a rough sketch in the coarse stage and then use the sketch to get the final result in the fine stage. However, they usually lack the correction ability when getting a wrong sketch. To solve this problem, in this paper, we propose an improved coarse2fine model with a control mechanism, with which our method can control the influence of the sketch on the final results in the fine stage. Even if the sketch is wrong, our model still has the opportunity to get a correct result. We have experimented our model on the tasks of semantic parsing and math word problem solving. The results have shown the effectiveness of our proposed model.

pdf bib
Enhancing Pre-Trained Language Representations with Rich Knowledge for Machine Reading Comprehension
An Yang | Quan Wang | Jing Liu | Kai Liu | Yajuan Lyu | Hua Wu | Qiaoqiao She | Sujian Li
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Machine reading comprehension (MRC) is a crucial and challenging task in NLP. Recently, pre-trained language models (LMs), especially BERT, have achieved remarkable success, presenting new state-of-the-art results in MRC. In this work, we investigate the potential of leveraging external knowledge bases (KBs) to further improve BERT for MRC. We introduce KT-NET, which employs an attention mechanism to adaptively select desired knowledge from KBs, and then fuses selected knowledge with BERT to enable context- and knowledge-aware predictions. We believe this would combine the merits of both deep LMs and curated KBs towards better MRC. Experimental results indicate that KT-NET offers significant and consistent improvements over BERT, outperforming competitive baselines on ReCoRD and SQuAD1.1 benchmarks. Notably, it ranks the 1st place on the ReCoRD leaderboard, and is also the best single model on the SQuAD1.1 leaderboard at the time of submission (March 4th, 2019).

pdf bib
Exploring Sequence-to-Sequence Learning in Aspect Term Extraction
Dehong Ma | Sujian Li | Fangzhao Wu | Xing Xie | Houfeng Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Aspect term extraction (ATE) aims at identifying all aspect terms in a sentence and is usually modeled as a sequence labeling problem. However, sequence labeling based methods cannot make full use of the overall meaning of the whole sentence and have the limitation in processing dependencies between labels. To tackle these problems, we first explore to formalize ATE as a sequence-to-sequence (Seq2Seq) learning task where the source sequence and target sequence are composed of words and labels respectively. At the same time, to make Seq2Seq learning suit to ATE where labels correspond to words one by one, we design the gated unit networks to incorporate corresponding word representation into the decoder, and position-aware attention to pay more attention to the adjacent words of a target word. The experimental results on two datasets show that Seq2Seq learning is effective in ATE accompanied with our proposed gated unit networks and position-aware attention mechanism.

pdf bib
Incorporating Textual Evidence in Visual Storytelling
Tianyi Li | Sujian Li
Proceedings of the 1st Workshop on Discourse Structure in Neural NLG

Previous work on visual storytelling mainly focused on exploring image sequence as evidence for storytelling and neglected textual evidence for guiding story generation. Motivated by human storytelling process which recalls stories for familiar images, we exploit textual evidence from similar images to help generate coherent and meaningful stories. To pick the images which may provide textual experience, we propose a two-step ranking method based on image object recognition techniques. To utilize textual information, we design an extended Seq2Seq model with two-channel encoder and attention. Experiments on the VIST dataset show that our method outperforms state-of-the-art baseline models without heavy engineering.

pdf bib
Zero-shot Chinese Discourse Dependency Parsing via Cross-lingual Mapping
Yi Cheng | Sujian Li
Proceedings of the 1st Workshop on Discourse Structure in Neural NLG

Due to the absence of labeled data, discourse parsing still remains challenging in some languages. In this paper, we present a simple and efficient method to conduct zero-shot Chinese text-level dependency parsing by leveraging English discourse labeled data and parsing techniques. We first construct the Chinese-English mapping from the level of sentence and elementary discourse unit (EDU), and then exploit the parsing results of the corresponding English translations to obtain the discourse trees for the Chinese text. This method can automatically conduct Chinese discourse parsing, with no need of a large scale of Chinese labeled data.

2018

pdf bib
Auto-Dialabel: Labeling Dialogue Data with Unsupervised Learning
Chen Shi | Qi Chen | Lei Sha | Sujian Li | Xu Sun | Houfeng Wang | Lintao Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The lack of labeled data is one of the main challenges when building a task-oriented dialogue system. Existing dialogue datasets usually rely on human labeling, which is expensive, limited in size, and in low coverage. In this paper, we instead propose our framework auto-dialabel to automatically cluster the dialogue intents and slots. In this framework, we collect a set of context features, leverage an autoencoder for feature assembly, and adapt a dynamic hierarchical clustering method for intent and slot labeling. Experimental results show that our framework can promote human labeling cost to a great extent, achieve good intent clustering accuracy (84.1%), and provide reasonable and instructive slot labeling results.

pdf bib
Toward Fast and Accurate Neural Discourse Segmentation
Yizhong Wang | Sujian Li | Jingfeng Yang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Discourse segmentation, which segments texts into Elementary Discourse Units, is a fundamental step in discourse analysis. Previous discourse segmenters rely on complicated hand-crafted features and are not practical in actual use. In this paper, we propose an end-to-end neural segmenter based on BiLSTM-CRF framework. To improve its accuracy, we address the problem of data insufficiency by transferring a word representation model that is trained on a large corpus. We also propose a restricted self-attention mechanism in order to capture useful information within a neighborhood. Experiments on the RST-DT corpus show that our model is significantly faster than previous methods, while achieving new state-of-the-art performance.

pdf bib
Joint Learning for Targeted Sentiment Analysis
Dehong Ma | Sujian Li | Houfeng Wang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Targeted sentiment analysis (TSA) aims at extracting targets and classifying their sentiment classes. Previous works only exploit word embeddings as features and do not explore more potentials of neural networks when jointly learning the two tasks. In this paper, we carefully design the hierarchical stack bidirectional gated recurrent units (HSBi-GRU) model to learn abstract features for both tasks, and we propose a HSBi-GRU based joint model which allows the target label to have influence on their sentiment label. Experimental results on two datasets show that our joint learning model can outperform other baselines and demonstrate the effectiveness of HSBi-GRU in learning abstract features.

pdf bib
Retrieve, Rerank and Rewrite: Soft Template Based Neural Summarization
Ziqiang Cao | Wenjie Li | Sujian Li | Furu Wei
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most previous seq2seq summarization systems purely depend on the source text to generate summaries, which tends to work unstably. Inspired by the traditional template-based summarization approaches, this paper proposes to use existing summaries as soft templates to guide the seq2seq model. To this end, we use a popular IR platform to Retrieve proper summaries as candidate templates. Then, we extend the seq2seq framework to jointly conduct template Reranking and template-aware summary generation (Rewriting). Experiments show that, in terms of informativeness, our model significantly outperforms the state-of-the-art methods, and even soft templates themselves demonstrate high competitiveness. In addition, the import of high-quality external summaries improves the stability and readability of generated summaries.

pdf bib
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Yizhong Wang | Kai Liu | Jing Liu | Wei He | Yajuan Lyu | Hua Wu | Sujian Li | Haifeng Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Machine reading comprehension (MRC) on real web data usually requires the machine to answer a question by analyzing multiple passages retrieved by search engine. Compared with MRC on a single passage, multi-passage MRC is more challenging, since we are likely to get multiple confusing answer candidates from different passages. To address this problem, we propose an end-to-end neural model that enables those answer candidates from different passages to verify each other based on their content representations. Specifically, we jointly train three modules that can predict the final answer based on three factors: the answer boundary, the answer content and the cross-passage answer verification. The experimental results show that our method outperforms the baseline by a large margin and achieves the state-of-the-art performance on the English MS-MARCO dataset and the Chinese DuReader dataset, both of which are designed for MRC in real-world settings.

pdf bib
SciDTB: Discourse Dependency TreeBank for Scientific Abstracts
An Yang | Sujian Li
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Annotation corpus for discourse relations benefits NLP tasks such as machine translation and question answering. In this paper, we present SciDTB, a domain-specific discourse treebank annotated on scientific articles. Different from widely-used RST-DT and PDTB, SciDTB uses dependency trees to represent discourse structure, which is flexible and simplified to some extent but do not sacrifice structural integrity. We discuss the labeling framework, annotation workflow and some statistics about SciDTB. Furthermore, our treebank is made as a benchmark for evaluating discourse dependency parsers, on which we provide several baselines as fundamental work.

pdf bib
Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task
An Yang | Kai Liu | Jing Liu | Yajuan Lyu | Sujian Li
Proceedings of the Workshop on Machine Reading for Question Answering

Current evaluation metrics to question answering based machine reading comprehension (MRC) systems generally focus on the lexical overlap between candidate and reference answers, such as ROUGE and BLEU. However, bias may appear when these metrics are used for specific question types, especially questions inquiring yes-no opinions and entity lists. In this paper, we make adaptations on the metrics to better correlate n-gram overlap with the human judgment for answers to these two question types. Statistical analysis proves the effectiveness of our approach. Our adaptations may provide positive guidance for the development of real-scene MRC systems.

pdf bib
Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension
Liang Wang | Sujian Li | Wei Zhao | Kewei Shen | Meng Sun | Ruoyu Jia | Jingming Liu
Proceedings of the 27th International Conference on Computational Linguistics

Cloze-style reading comprehension has been a popular task for measuring the progress of natural language understanding in recent years. In this paper, we design a novel multi-perspective framework, which can be seen as the joint training of heterogeneous experts and aggregate context information from different perspectives. Each perspective is modeled by a simple aggregation module. The outputs of multiple aggregation modules are fed into a one-timestep pointer network to get the final answer. At the same time, to tackle the problem of insufficient labeled data, we propose an efficient sampling mechanism to automatically generate more training examples by matching the distribution of candidates between labeled and unlabeled data. We conduct our experiments on a recently released cloze-test dataset CLOTH (Xie et al., 2017), which consists of nearly 100k questions designed by professional teachers. Results show that our method achieves new state-of-the-art performance over previous strong baselines.

pdf bib
Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation
Shuming Ma | Xu Sun | Wei Li | Sujian Li | Wenjie Li | Xuancheng Ren
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Most recent approaches use the sequence-to-sequence model for paraphrase generation. The existing sequence-to-sequence model tends to memorize the words and the patterns in the training dataset instead of learning the meaning of the words. Therefore, the generated sentences are often grammatically correct but semantically improper. In this work, we introduce a novel model based on the encoder-decoder framework, called Word Embedding Attention Network (WEAN). Our proposed model generates the words by querying distributed word representations (i.e. neural word embeddings), hoping to capturing the meaning of the according words. Following previous work, we evaluate our model on two paraphrase-oriented tasks, namely text simplification and short text abstractive summarization. Experimental results show that our model outperforms the sequence-to-sequence baseline by the BLEU score of 6.3 and 5.5 on two English text simplification datasets, and the ROUGE-2 F1 score of 5.7 on a Chinese summarization dataset. Moreover, our model achieves state-of-the-art performances on these three benchmark datasets.

2017

pdf bib
A Two-Stage Parsing Method for Text-Level Discourse Analysis
Yizhong Wang | Sujian Li | Houfeng Wang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Previous work introduced transition-based algorithms to form a unified architecture of parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. In this paper, we propose that transition-based model is more appropriate for parsing the naked discourse tree (i.e., identifying span and nuclearity) due to data sparsity. At the same time, we argue that relation labeling can benefit from naked tree structure and should be treated elaborately with consideration of three kinds of relations including within-sentence, across-sentence and across-paragraph relations. Thus, we design a pipelined two-stage parsing method for generating an RST tree from text. Experimental results show that our method achieves state-of-the-art performance, especially on span and nuclearity identification.

pdf bib
Tag-Enhanced Tree-Structured Neural Networks for Implicit Discourse Relation Classification
Yizhong Wang | Sujian Li | Jingfeng Yang | Xu Sun | Houfeng Wang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Identifying implicit discourse relations between text spans is a challenging task because it requires understanding the meaning of the text. To tackle this task, recent studies have tried several deep learning methods but few of them exploited the syntactic information. In this work, we explore the idea of incorporating syntactic parse tree into neural networks. Specifically, we employ the Tree-LSTM model and Tree-GRU model, which is based on the tree structure, to encode the arguments in a relation. And we further leverage the constituent tags to control the semantic composition process in these tree-structured neural networks. Experimental results show that our method achieves state-of-the-art performance on PDTB corpus.

pdf bib
Cascading Multiway Attentions for Document-level Sentiment Classification
Dehong Ma | Sujian Li | Xiaodong Zhang | Houfeng Wang | Xu Sun
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level sentiment classification aims to assign the user reviews a sentiment polarity. Previous methods either just utilized the document content without consideration of user and product information, or did not comprehensively consider what roles the three kinds of information play in text modeling. In this paper, to reasonably use all the information, we present the idea that user, product and their combination can all influence the generation of attentions to words and sentences, when judging the sentiment of a document. With this idea, we propose a cascading multiway attention (CMA) model, where multiple ways of using user and product information are cascaded to influence the generation of attentions on the word and sentence layers. Then, sentences and documents are well modeled by multiple representation vectors, which provide rich information for sentiment classification. Experiments on IMDB and Yelp datasets demonstrate the effectiveness of our model.

pdf bib
Learning to Rank Semantic Coherence for Topic Segmentation
Liang Wang | Sujian Li | Yajuan Lv | Houfeng Wang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Topic segmentation plays an important role for discourse parsing and information retrieval. Due to the absence of training data, previous work mainly adopts unsupervised methods to rank semantic coherence between paragraphs for topic segmentation. In this paper, we present an intuitive and simple idea to automatically create a “quasi” training dataset, which includes a large amount of text pairs from the same or different documents with different semantic coherence. With the training corpus, we design a symmetric CNN neural network to model text pairs and rank the semantic coherence within the learning to rank framework. Experiments show that our algorithm is able to achieve competitive performance over strong baselines on several real-world datasets.

pdf bib
PKU_ICL at SemEval-2017 Task 10: Keyphrase Extraction with Model Ensemble and External Knowledge
Liang Wang | Sujian Li
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper presents a system that participated in SemEval 2017 Task 10 (subtask A and subtask B): Extracting Keyphrases and Relations from Scientific Publications (Augenstein et al., 2017). Our proposed approach utilizes external knowledge to enrich feature representation of candidate keyphrase, including Wikipedia, IEEE taxonomy and pre-trained word embeddings etc. Ensemble of unsupervised models, random forest and linear models are used for candidate keyphrase ranking and keyphrase type classification. Our system achieves the 3rd place in subtask A and 4th place in subtask B.

2016

pdf bib
News Stream Summarization using Burst Information Networks
Tao Ge | Lei Cui | Baobao Chang | Sujian Li | Ming Zhou | Zhifang Sui
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention
Yang Liu | Sujian Li
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Capturing Argument Relationship for Chinese Semantic Role Labeling
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui | Tingsong Jiang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Encoding Temporal Information for Time-Aware Link Prediction
Tingsong Jiang | Tianyu Liu | Tao Ge | Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Joint Learning Templates and Slots for Event Schema Induction
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
AttSum: Joint Learning of Focusing and Summarization with Neural Attention
Ziqiang Cao | Wenjie Li | Sujian Li | Furu Wei | Yanran Li
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Query relevance ranking and sentence saliency ranking are the two main tasks in extractive query-focused summarization. Previous supervised summarization systems often perform the two tasks in isolation. However, since reference summaries are the trade-off between relevance and saliency, using them as supervision, neither of the two rankers could be trained well. This paper proposes a novel summarization system called AttSum, which tackles the two tasks jointly. It automatically learns distributed representations for sentences as well as the document cluster. Meanwhile, it applies the attention mechanism to simulate the attentive reading of human behavior when a query is given. Extensive experiments are conducted on DUC query-focused summarization benchmark datasets. Without using any hand-crafted features, AttSum achieves competitive performance. We also observe that the sentences recognized to focus on the query indeed meet the query need.

pdf bib
Towards Time-Aware Knowledge Graph Completion
Tingsong Jiang | Tianyu Liu | Tao Ge | Lei Sha | Baobao Chang | Sujian Li | Zhifang Sui
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Knowledge graph (KG) completion adds new facts to a KG by making inferences from existing facts. Most existing methods ignore the time information and only learn from time-unknown fact triples. In dynamic environments that evolve over time, it is important and challenging for knowledge graph completion models to take into account the temporal aspects of facts. In this paper, we present a novel time-aware knowledge graph completion model that is able to predict links in a KG using both the existing facts and the temporal information of the facts. To incorporate the happening time of facts, we propose a time-aware KG embedding model using temporal order information among facts. To incorporate the valid time of facts, we propose a joint time-aware inference model based on Integer Linear Programming (ILP) using temporal consistencyinformationasconstraints. Wefurtherintegratetwomodelstomakefulluseofglobal temporal information. We empirically evaluate our models on time-aware KG completion task. Experimental results show that our time-aware models achieve the state-of-the-art on temporal facts consistently.

pdf bib
Reading and Thinking: Re-read LSTM Unit for Textual Entailment Recognition
Lei Sha | Baobao Chang | Zhifang Sui | Sujian Li
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recognizing Textual Entailment (RTE) is a fundamentally important task in natural language processing that has many applications. The recently released Stanford Natural Language Inference (SNLI) corpus has made it possible to develop and evaluate deep neural network methods for the RTE task. Previous neural network based methods usually try to encode the two sentences (premise and hypothesis) and send them together into a multi-layer perceptron to get their entailment type, or use LSTM-RNN to link two sentences together while using attention mechanic to enhance the model’s ability. In this paper, we propose to use the re-read mechanic, which means to read the premise again and again while reading the hypothesis. After read the premise again, the model can get a better understanding of the premise, which can also affect the understanding of the hypothesis. On the contrary, a better understanding of the hypothesis can also affect the understanding of the premise. With the alternative re-read process, the model can “think” of a better decision of entailment type. We designed a new LSTM unit called re-read LSTM (rLSTM) to implement this “thinking” process. Experiments show that we achieve results better than current state-of-the-art equivalents.

pdf bib
RBPB: Regularization-Based Pattern Balancing Method for Event Extraction
Lei Sha | Jing Liu | Chin-Yew Lin | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Component-Enhanced Chinese Character Embeddings
Yanran Li | Wenjie Li | Fei Sun | Sujian Li
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Recognizing Textual Entailment Using Probabilistic Inference
Lei Sha | Sujian Li | Baobao Chang | Zhifang Sui | Tingsong Jiang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Why Read if You Can Scan? Trigger Scoping Strategy for Biographical Fact Extraction
Dian Yu | Heng Ji | Sujian Li | Chin-Yew Lin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Bring you to the past: Automatic Generation of Topically Relevant Event Chronicles
Tao Ge | Wenzhe Pei | Heng Ji | Sujian Li | Baobao Chang | Zhifang Sui
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Context-aware Entity Morph Decoding
Boliang Zhang | Hongzhao Huang | Xiaoman Pan | Sujian Li | Chin-Yew Lin | Heng Ji | Kevin Knight | Zhen Wen | Yizhou Sun | Jiawei Han | Bulent Yener
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
A Dependency-Based Neural Network for Relation Classification
Yang Liu | Furu Wei | Sujian Li | Heng Ji | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
A Hierarchical Knowledge Representation for Expert Finding on Social Media
Yanran Li | Wenjie Li | Sujian Li
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
Learning Summary Prior Representation for Extractive Summarization
Ziqiang Cao | Furu Wei | Sujian Li | Wenjie Li | Ming Zhou | Houfeng Wang
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Joint Learning of Chinese Words, Terms and Keywords
Ziqiang Cao | Sujian Li | Heng Ji
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Constructing Information Networks Using One Single Model
Qi Li | Heng Ji | Yu Hong | Sujian Li
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Query-focused Multi-Document Summarization: Combining a Topic Model with Graph-based Semi-supervised Learning
Yanran Li | Sujian Li
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Text-level Discourse Dependency Parsing
Sujian Li | Liang Wang | Ziqiang Cao | Wenjie Li
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

pdf bib
Event-Based Time Label Propagation for Automatic Dating of News Articles
Tao Ge | Baobao Chang | Sujian Li | Zhifang Sui
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
TopicSpam: a Topic-Model based approach for spam detection
Jiwei Li | Claire Cardie | Sujian Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Evolutionary Hierarchical Dirichlet Process for Timeline Summarization
Jiwei Li | Sujian Li
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
A Novel Feature-based Bayesian Model for Query Focused Multi-document Summarization
Jiwei Li | Sujian Li
Transactions of the Association for Computational Linguistics, Volume 1

Supervised learning methods and LDA based topic model have been successfully applied in the field of multi-document summarization. In this paper, we propose a novel supervised approach that can incorporate rich sentence features into Bayesian topic models in a principled way, thus taking advantages of both topic model and feature based supervised learning methods. Experimental results on DUC2007, TAC2008 and TAC2009 demonstrate the effectiveness of our approach.

2012

pdf bib
Joint Learning for Coreference Resolution with Markov Logic
Yang Song | Jing Jiang | Wayne Xin Zhao | Sujian Li | Houfeng Wang
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Update Summarization using a Multi-level Hierarchical Dirichlet Process Model
Jiwei Li | Sujian Li | Xun Wang | Ye Tian | Baobao Chang
Proceedings of COLING 2012

pdf bib
Implicit Discourse Relation Recognition by Selecting Typical Training Examples
Xun Wang | Sujian Li | Jiwei Li | Wenjie Li
Proceedings of COLING 2012

pdf bib
Constructing Chinese Abbreviation Dictionary: A Stacked Approach
Longkai Zhang | Sujian Li | Houfeng Wang | Ni Sun | Xinfan Meng
Proceedings of COLING 2012

pdf bib
The Task 2 of CIPS-SIGHAN 2012 Named Entity Recognition and Disambiguation in Chinese Bakeoff
Zhengyan He | Houfeng Wang | Sujian Li
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

2010

pdf bib
A Semi-Supervised Key Phrase Extraction Approach: Learning from Title Phrases through a Document Semantic Network
Decong Li | Sujian Li | Wenjie Li | Wei Wang | Weiguang Qu
Proceedings of the ACL 2010 Conference Short Papers

2006

pdf bib
Interaction between Lexical Base and Ontology with Formal Concept Analysis
Sujian Li | Qin Lu | Wenjie Li | Ruifeng Xu
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

An ontology describes conceptual knowledge in a specific domain. A lexical base collects a repository of words and gives independent definition of concepts. In this paper, we propose to use FCA as a tool to help constructing an ontology through an existing lexical base. We mainly address two issues. The first issue is how to select attributes to visualize the relations between lexical terms. The second issue is how to revise lexical definitions through analysing the relations in the ontology. Thus the focus is on the effect of interaction between a lexical base and an ontology for the purpose of good ontology construction. Finally, experiments have been conducted to verify our ideas.

pdf bib
The Design and Construction of A Chinese Collocation Bank
Ruifeng Xu | Qin Lu | Sujian Li
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents an annotated Chinese collocation bank developed at the Hong Kong Polytechnic University. The definition of collocation with good linguistic consistency and good computational operability is first discussed and the properties of collocations are then presented. Secondly, based on the combination of different properties, collocations are classified into four types. Thirdly, the annotation guideline is presented. Fourthly, the implementation issues for collocation bank construction are addressed including the annotation with categorization, dependency and contextual information. Currently, the collocation bank is completed for 3,643 headwords in a 5-million-word corpus.

2005

pdf bib
Experiments of Ontology Construction with Formal Concept Analysis
Sujian Li | Qin Lu | Wenjie Li
Proceedings of OntoLex 2005 - Ontologies and Lexical Resources

pdf bib
双向考察和驗證:并列成分中心語的語義關係和CCD的名詞語義分類体系 (Bidirectional Investigation: The Semantic Relations between the Conjuncts and the Noun Taxonomy in CCD) [In Chinese]
Yunfang Wu | Sujian Li | Yun Li | Shiwen Yu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 4, December 2005: Special Issue on Selected Papers from CLSW-5

pdf bib
隱喻性成語的語義映射 (Semantic Mapping in Chinese Metaphorical Idioms) [In Chinese]
Yun Li | Sujian Li | Zhimin Wang | Yunfang Wu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 4, December 2005: Special Issue on Selected Papers from CLSW-5

2003

pdf bib
News-Oriented Keyword Indexing with Maximum Entropy Principle
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the 17th Pacific Asia Conference on Language, Information and Computation

pdf bib
News-Oriented Automatic Chinese Keyword Indexing
Sujian Li | Houfeng Wang | Shiwen Yu | Chengsheng Xin
Proceedings of the Second SIGHAN Workshop on Chinese Language Processing

2002

pdf bib
基於《知網》的辭彙語義相似度計算 (Word Similarity Computing Based on How-net) [In Chinese]
Qun Liu | Sujian Li
International Journal of Computational Linguistics & Chinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational Chinese Lexical Semantics