Yansong Feng


2021

pdf bib
Learning to Organize a Bag of Words into Sentences with Neural Networks: An Empirical Study
Chongyang Tao | Shen Gao | Juntao Li | Yansong Feng | Dongyan Zhao | Rui Yan
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Sequential information, a.k.a., orders, is assumed to be essential for processing a sequence with recurrent neural network or convolutional neural network based encoders. However, is it possible to encode natural languages without orders? Given a bag of words from a disordered sentence, humans may still be able to understand what those words mean by reordering or reconstructing them. Inspired by such an intuition, in this paper, we perform a study to investigate how “order” information takes effects in natural language learning. By running comprehensive comparisons, we quantitatively compare the ability of several representative neural models to organize sentences from a bag of words under three typical scenarios, and summarize some empirical findings and challenges, which can shed light on future research on this line of work.

pdf bib
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai | Yijia Liu | Yansong Feng | Songfang Huang | Dongyan Zhao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Chinese pre-trained language models usually process text as a sequence of characters, while ignoring more coarse granularity, e.g., words. In this work, we propose a novel pre-training paradigm for Chinese — Lattice-BERT, which explicitly incorporates word representations along with characters, thus can model a sentence in a multi-granularity manner. Specifically, we construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers. We design a lattice position attention mechanism to exploit the lattice structures in self-attention layers. We further propose a masked segment prediction task to push the model to learn from rich but redundant information inherent in lattices, while avoiding learning unexpected tricks. Experiments on 11 Chinese natural language understanding tasks show that our model can bring an average increase of 1.5% under the 12-layer setting, which achieves new state-of-the-art among base-size models on the CLUE benchmarks. Further analysis shows that Lattice-BERT can harness the lattice structures, and the improvement comes from the exploration of redundant information and multi-granularity representations. Our code will be available at https://github.com/alibaba/pretrained-language-models/LatticeBERT.

pdf bib
Everything Has a Cause: Leveraging Causal Inference in Legal Text Analysis
Xiao Liu | Da Yin | Yansong Feng | Yuting Wu | Dongyan Zhao
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Causal inference is the process of capturing cause-effect relationship among variables. Most existing works focus on dealing with structured data, while mining causal relationship among factors from unstructured data, like text, has been less examined, but is of great importance, especially in the legal domain. In this paper, we propose a novel Graph-based Causal Inference (GCI) framework, which builds causal graphs from fact descriptions without much human involvement and enables causal inference to facilitate legal practitioners to make proper decisions. We evaluate the framework on a challenging similar charge disambiguation task. Experimental results show that GCI can capture the nuance from fact descriptions among multiple confusing charges and provide explainable discrimination, especially in few-shot settings. We also observe that the causal knowledge contained in GCI can be effectively injected into powerful neural networks for better performance and interpretability.

pdf bib
Exploring Distantly-Labeled Rationales in Neural Network Models
Quzhe Huang | Shengqi Zhu | Yansong Feng | Dongyan Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recent studies strive to incorporate various human rationales into neural networks to improve model performance, but few pay attention to the quality of the rationales. Most existing methods distribute their models’ focus to distantly-labeled rationale words entirely and equally, while ignoring the potential important non-rationale words and not distinguishing the importance of different rationale words. In this paper, we propose two novel auxiliary loss functions to make better use of distantly-labeled rationales, which encourage models to maintain their focus on important words beyond labeled rationales (PINs) and alleviate redundant training on non-helpful rationales (NoIRs). Experiments on two representative classification tasks show that our proposed methods can push a classification model to effectively learn crucial clues from non-perfect rationales while maintaining the ability to spread its focus to other unlabeled important words, thus significantly outperform existing methods.

pdf bib
Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction
Quzhe Huang | Shengqi Zhu | Yansong Feng | Yuan Ye | Yuxuan Lai | Dongyan Zhao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Document-level Relation Extraction (RE) is a more challenging task than sentence RE as it often requires reasoning over multiple sentences. Yet, human annotators usually use a small number of sentences to identify the relationship between a given entity pair. In this paper, we present an embarrassingly simple but effective method to heuristically select evidence sentences for document-level RE, which can be easily combined with BiLSTM to achieve good performance on benchmark datasets, even better than fancy graph neural network based methods. We have released our code at https://github.com/AndrewZhe/Three-Sentences-Are-All-You-Need.

pdf bib
Why Machine Reading Comprehension Models Learn Shortcuts?
Yuxuan Lai | Chen Zhang | Yansong Feng | Quzhe Huang | Dongyan Zhao
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
Exploring Question-Specific Rewards for Generating Deep Questions
Yuxi Xie | Liangming Pan | Dongzhe Wang | Min-Yen Kan | Yansong Feng
Proceedings of the 28th International Conference on Computational Linguistics

Recent question generation (QG) approaches often utilize the sequence-to-sequence framework (Seq2Seq) to optimize the log likelihood of ground-truth questions using teacher forcing. However, this training objective is inconsistent with actual question quality, which is often reflected by certain global properties such as whether the question can be answered by the document. As such, we directly optimize for QG-specific objectives via reinforcement learning to improve question quality. We design three different rewards that target to improve the fluency, relevance, and answerability of generated questions. We conduct both automatic and human evaluations in addition to thorough analysis to explore the effect of each QG-specific reward. We find that optimizing on question-specific rewards generally leads to better performance in automatic evaluation metrics. However, only the rewards that correlate well with human judgement (e.g., relevance) lead to real improvement in question quality. Optimizing for the others, especially answerability, introduces incorrect bias to the model, resulting in poorer question quality. The code is publicly available at https://github.com/YuxiXie/RL-for-Question-Generation.

pdf bib
Towards Context-Aware Code Comment Generation
Xiaohan Yu | Quzhe Huang | Zheng Wang | Yansong Feng | Dongyan Zhao
Findings of the Association for Computational Linguistics: EMNLP 2020

Code comments are vital for software maintenance and comprehension, but many software projects suffer from the lack of meaningful and up-to-date comments in practice. This paper presents a novel approach to automatically generate code comments at a function level by targeting object-oriented programming languages. Unlike prior work that only uses information locally available within the target function, our approach leverages broader contextual information by considering all other functions of the same class. To propagate and integrate information beyond the scope of the target function, we design a novel learning framework based on the bidirectional gated recurrent unit and a graph attention network with a pointer mechanism. We apply our approach to produce code comments for Java methods and compare it against four strong baseline methods. Experimental results show that our approach outperforms most methods by a large margin and achieves a comparable result with the state-of-the-art method.

pdf bib
Semantic Graphs for Generating Deep Questions
Liangming Pan | Yuxi Xie | Yansong Feng | Tat-Seng Chua | Min-Yen Kan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information about the input passage. In order to capture the global structure of the document and facilitate reasoning, we propose a novel framework that first constructs a semantic-level graph for the input document and then encodes the semantic graph by introducing an attention-based GGNN (Att-GGNN). Afterward, we fuse the document-level and graph-level representations to perform joint training of content selection and question decoding. On the HotpotQA deep-question centric dataset, our model greatly improves performance over questions requiring reasoning over multiple facts, leading to state-of-the-art performance. The code is publicly available at https://github.com/WING-NUS/SG-Deep-Question-Generation.

pdf bib
Neighborhood Matching Network for Entity Alignment
Yuting Wu | Xiao Liu | Yansong Feng | Zheng Wang | Dongyan Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Structural heterogeneity between knowledge graphs is an outstanding challenge for entity alignment. This paper presents Neighborhood Matching Network (NMN), a novel entity alignment framework for tackling the structural heterogeneity challenge. NMN estimates the similarities between entities to capture both the topological structure and the neighborhood difference. It provides two innovative components for better learning representations for entity alignment. It first uses a novel graph sampling method to distill a discriminative neighborhood for each entity. It then adopts a cross-graph neighborhood matching module to jointly encode the neighborhood difference for a given entity pair. Such strategies allow NMN to effectively construct matching-oriented entity representations while ignoring noisy neighbors that have a negative impact on the alignment task. Extensive experiments performed on three entity alignment datasets show that NMN can well estimate the neighborhood similarity in more tough cases and significantly outperforms 12 previous state-of-the-art methods.

pdf bib
Understanding Procedural Text using Interactive Entity Networks
Jizhi Tang | Yansong Feng | Dongyan Zhao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The task of procedural text comprehension aims to understand the dynamic nature of entities/objects in a process. Here, the key is to track how the entities interact with each other and how their states are changing along the procedure. Recent efforts have made great progress to track multiple entities in a procedural text, but usually treat each entity separately and ignore the fact that there are often multiple entities interacting with each other during one process, some of which are even explicitly mentioned. In this paper, we propose a novel Interactive Entity Network (IEN), which is a recurrent network with memory equipped cells for state tracking. In each IEN cell, we maintain different attention matrices through specific memories to model different types of entity interactions. Importantly, we can update these memories in a sequential manner so as to explore the causal relationship between entity actions and subsequent state changes. We evaluate our model on a benchmark dataset, and the results show that IEN outperforms state-of-the-art models by precisely capturing the interactions of multiple entities and explicitly leverage the relationship between entity interactions and subsequent state changes.

2019

pdf bib
Jointly Learning Entity and Relation Representations for Entity Alignment
Yuting Wu | Xiao Liu | Yansong Feng | Zheng Wang | Dongyan Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Entity alignment is a viable means for integrating heterogeneous knowledge among different knowledge graphs (KGs). Recent developments in the field often take an embedding-based approach to model the structural information of KGs so that entity alignment can be easily performed in the embedding space. However, most existing works do not explicitly utilize useful relation representations to assist in entity alignment, which, as we will show in the paper, is a simple yet effective way for improving entity alignment. This paper presents a novel joint learning framework for entity alignment. At the core of our approach is a Graph Convolutional Network (GCN) based framework for learning both entity and relation representations. Rather than relying on pre-aligned relation seeds to learn relation representations, we first approximate them using entity embeddings learned by the GCN. We then incorporate the relation approximation into entities to iteratively learn better representations for both. Experiments performed on three real-world cross-lingual datasets show that our approach substantially outperforms state-of-the-art entity alignment methods.

pdf bib
Sampling Matters! An Empirical Study of Negative Sampling Strategies for Learning of Matching Models in Retrieval-based Dialogue Systems
Jia Li | Chongyang Tao | Wei Wu | Yansong Feng | Dongyan Zhao | Rui Yan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We study how to sample negative examples to automatically construct a training set for effective model learning in retrieval-based dialogue systems. Following an idea of dynamically adapting negative examples to matching models in learning, we consider four strategies including minimum sampling, maximum sampling, semi-hard sampling, and decay-hard sampling. Empirical studies on two benchmarks with three matching models indicate that compared with the widely used random sampling strategy, although the first two strategies lead to performance drop, the latter two ones can bring consistent improvement to the performance of all the models on both benchmarks.

pdf bib
Learning to Update Knowledge Graphs by Reading News
Jizhi Tang | Yansong Feng | Dongyan Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

News streams contain rich up-to-date information which can be used to update knowledge graphs (KGs). Most current text-based KG updating methods rely on elaborately designed information extraction (IE) systems and carefully crafted rules, which are often domain-specific and hard to maintain. Besides, such methods often hardly pay enough attention to the implicit information that lies underneath texts. In this paper, we propose a novel neural network method, GUpdater, to tackle these problems. GUpdater is built upon graph neural networks (GNNs) with a text-based attention mechanism to guide the updating message passing through the KG structures. Experiments on a real-world KG updating dataset show that our model can effectively broadcast the news information to the KG structures and perform necessary link-adding or link-deleting operations to ensure the KG up-to-date according to news snippets.

pdf bib
Easy First Relation Extraction with Information Redundancy
Shuai Ma | Gang Wang | Yansong Feng | Jinpeng Huai
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Many existing relation extraction (RE) models make decisions globally using integer linear programming (ILP). However, it is nontrivial to make use of integer linear programming as a blackbox solver for RE. Its cost of time and memory may become unacceptable with the increase of data scale, and redundant information needs to be encoded cautiously for ILP. In this paper, we propose an easy first approach for relation extraction with information redundancies, embedded in the results produced by local sentence level extractors, during which conflict decisions are resolved with domain and uniqueness constraints. Information redundancies are leveraged to support both easy first collective inference for easy decisions in the first stage and ILP for hard decisions in a subsequent stage. Experimental study shows that our approach improves the efficiency and accuracy of RE, and outperforms both ILP and neural network-based methods.

pdf bib
Generating Classical Chinese Poems from Vernacular Chinese
Zhichao Yang | Pengshan Cai | Yansong Feng | Fei Li | Weijiang Feng | Elena Suet-Ying Chiu | Hong Yu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Classical Chinese poetry is a jewel in the treasure house of Chinese culture. Previous poem generation models only allow users to employ keywords to interfere the meaning of generated poems, leaving the dominion of generation to the model. In this paper, we propose a novel task of generating classical Chinese poems from vernacular, which allows users to have more control over the semantic of generated poems. We adapt the approach of unsupervised machine translation (UMT) to our task. We use segmentation-based padding and reinforcement learning to address under-translation and over-translation respectively. According to experiments, our approach significantly improve the perplexity and BLEU compared with typical UMT models. Furthermore, we explored guidelines on how to write the input vernacular to generate better poems. Human evaluation showed our approach can generate high-quality poems which are comparable to amateur poems.

pdf bib
Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering
Kun Xu | Yuxuan Lai | Yansong Feng | Zhiguo Wang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Traditional Key-value Memory Neural Networks (KV-MemNNs) are proved to be effective to support shallow reasoning over a collection of documents in domain specific Question Answering or Reading Comprehension tasks. However, extending KV-MemNNs to Knowledge Based Question Answering (KB-QA) is not trivia, which should properly decompose a complex question into a sequence of queries against the memory, and update the query representations to support multi-hop reasoning over the memory. In this paper, we propose a novel mechanism to enable conventional KV-MemNNs models to perform interpretable reasoning for complex questions. To achieve this, we design a new query updating strategy to mask previously-addressed memory information from the query representations, and introduce a novel STOP strategy to avoid invalid or repeated memory reading without strong annotation signals. This also enables KV-MemNNs to produce structured queries and work in a semantic parsing fashion. Experimental results on benchmark datasets show that our solution, trained with question-answer pairs only, can provide conventional KV-MemNNs models with better reasoning abilities on complex questions, and achieve state-of-art performances.

pdf bib
Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network
Kun Xu | Liwei Wang | Mo Yu | Yansong Feng | Yan Song | Zhiguo Wang | Dong Yu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Previous cross-lingual knowledge graph (KG) alignment studies rely on entity embeddings derived only from monolingual KG structural information, which may fail at matching entities that have different facts in two KGs. In this paper, we introduce the topic entity graph, a local sub-graph of an entity, to represent entities with their contextual information in KG. From this view, the KB-alignment task can be formulated as a graph matching problem; and we further propose a graph-attention based solution, which first matches all entities in two topic entity graphs, and then jointly model the local matching information to derive a graph-level matching vector. Experiments show that our model outperforms previous state-of-the-art methods by a large margin.

pdf bib
Learning a Matching Model with Co-teaching for Multi-turn Response Selection in Retrieval-based Dialogue Systems
Jiazhan Feng | Chongyang Tao | Wei Wu | Yansong Feng | Dongyan Zhao | Rui Yan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We study learning of a matching model for response selection in retrieval-based dialogue systems. The problem is equally important with designing the architecture of a model, but is less explored in existing literature. To learn a robust matching model from noisy training data, we propose a general co-teaching framework with three specific teaching strategies that cover both teaching with loss functions and teaching with data curriculum. Under the framework, we simultaneously learn two matching models with independent training sets. In each iteration, one model transfers the knowledge learned from its training set to the other model, and at the same time receives the guide from the other model on how to overcome noise in training. Through being both a teacher and a student, the two models learn from each other and get improved together. Evaluation results on two public data sets indicate that the proposed learning approach can generally and significantly improve the performance of existing matching models.

2018

pdf bib
Natural Answer Generation with Heterogeneous Memory
Yao Fu | Yansong Feng
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Memory augmented encoder-decoder framework has achieved promising progress for natural language generation tasks. Such frameworks enable a decoder to retrieve from a memory during generation. However, less research has been done to take care of the memory contents from different sources, which are often of heterogeneous formats. In this work, we propose a novel attention mechanism to encourage the decoder to actively interact with the memory by taking its heterogeneity into account. Our solution attends across the generated history and memory to explicitly avoid repetition, and introduce related knowledge to enrich our generated sentences. Experiments on the answer sentence generation task show that our method can effectively explore heterogeneous memory to produce readable and meaningful answer sentences while maintaining high coverage for given answer information.

pdf bib
Marrying Up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding
Bingfeng Luo | Yansong Feng | Zheng Wang | Songfang Huang | Rui Yan | Dongyan Zhao
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The success of many natural language processing (NLP) tasks is bound by the number and quality of annotated data, but there is often a shortage of such training data. In this paper, we ask the question: “Can we combine a neural network (NN) with regular expressions (RE) to improve supervised learning for NLP?”. In answer, we develop novel methods to exploit the rich expressiveness of REs at different levels within a NN, showing that the combination significantly enhances the learning effectiveness when a small number of training examples are available. We evaluate our approach by applying it to spoken language understanding for intent detection and slot filling. Experimental results show that our approach is highly effective in exploiting the available training data, giving a clear boost to the RE-unaware NN.

pdf bib
Modeling discourse cohesion for discourse parsing via memory network
Yanyan Jia | Yuan Ye | Yansong Feng | Yuxuan Lai | Rui Yan | Dongyan Zhao
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Identifying long-span dependencies between discourse units is crucial to improve discourse parsing performance. Most existing approaches design sophisticated features or exploit various off-the-shelf tools, but achieve little success. In this paper, we propose a new transition-based discourse parser that makes use of memory networks to take discourse cohesion into account. The automatically captured discourse cohesion benefits discourse parsing, especially for long span scenarios. Experiments on the RST discourse treebank show that our method outperforms traditional featured based methods, and the memory based discourse cohesion can improve the overall parsing performance significantly.

pdf bib
SQL-to-Text Generation with Graph-to-Sequence Model
Kun Xu | Lingfei Wu | Zhiguo Wang | Yansong Feng | Vadim Sheinin
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Previous work approaches the SQL-to-text generation task using vanilla Seq2Seq models, which may not fully capture the inherent graph-structured information in SQL query. In this paper, we propose a graph-to-sequence model to encode the global structure information into node embeddings. This model can effectively learn the correlation between the SQL query pattern and its interpretation. Experimental results on the WikiSQL dataset and Stackoverflow dataset show that our model outperforms the Seq2Seq and Tree2Seq baselines, achieving the state-of-the-art performance.

pdf bib
Multi-grained Attention Network for Aspect-Level Sentiment Classification
Feifan Fan | Yansong Feng | Dongyan Zhao
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose a novel multi-grained attention network (MGAN) model for aspect level sentiment classification. Existing approaches mostly adopt coarse-grained attention mechanism, which may bring information loss if the aspect has multiple words or larger context. We propose a fine-grained attention mechanism, which can capture the word-level interaction between aspect and context. And then we leverage the fine-grained and coarse-grained attention mechanisms to compose the MGAN framework. Moreover, unlike previous works which train each aspect with its context separately, we design an aspect alignment loss to depict the aspect-level interactions among the aspects that have the same context. We evaluate the proposed approach on three datasets: laptop and restaurant are from SemEval 2014, and the last one is a twitter dataset. Experimental results show that the multi-grained attention network consistently outperforms the state-of-the-art methods on all three datasets. We also conduct experiments to evaluate the effectiveness of aspect alignment loss, which indicates the aspect-level interactions can bring extra useful information and further improve the performance.

2017

pdf bib
Learning with Noise: Enhance Distantly Supervised Relation Extraction with Dynamic Transition Matrix
Bingfeng Luo | Yansong Feng | Zheng Wang | Zhanxing Zhu | Songfang Huang | Rui Yan | Dongyan Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Distant supervision significantly reduces human efforts in building training data for many classification tasks. While promising, this technique often introduces noise to the generated training data, which can severely affect the model performance. In this paper, we take a deep look at the application of distant supervision in relation extraction. We show that the dynamic transition matrix can effectively characterize the noise in the training data built by distant supervision. The transition matrix can be effectively trained using a novel curriculum learning based method without any direct supervision about the noise. We thoroughly evaluate our approach under a wide range of extraction scenarios. Experimental results show that our approach consistently improves the extraction results and outperforms the state-of-the-art in various evaluation scenarios.

pdf bib
How to Make Context More Useful? An Empirical Study on Context-Aware Neural Conversational Models
Zhiliang Tian | Rui Yan | Lili Mou | Yiping Song | Yansong Feng | Dongyan Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Generative conversational systems are attracting increasing attention in natural language processing (NLP). Recently, researchers have noticed the importance of context information in dialog processing, and built various models to utilize context. However, there is no systematic comparison to analyze how to use context effectively. In this paper, we conduct an empirical study to compare various models and investigate the effect of context information in dialog systems. We also propose a variant that explicitly weights context vectors by context-query relevance, outperforming the other baselines.

pdf bib
Towards Implicit Content-Introducing for Generative Short-Text Conversation Systems
Lili Yao | Yaoyuan Zhang | Yansong Feng | Dongyan Zhao | Rui Yan
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

The study on human-computer conversation systems is a hot research topic nowadays. One of the prevailing methods to build the system is using the generative Sequence-to-Sequence (Seq2Seq) model through neural networks. However, the standard Seq2Seq model is prone to generate trivial responses. In this paper, we aim to generate a more meaningful and informative reply when answering a given question. We propose an implicit content-introducing method which incorporates additional information into the Seq2Seq model in a flexible way. Specifically, we fuse the general decoding and the auxiliary cue word information through our proposed hierarchical gated fusion unit. Experiments on real-life data demonstrate that our model consistently outperforms a set of competitive baselines in terms of BLEU scores and human evaluation.

pdf bib
Learning to Predict Charges for Criminal Cases with Legal Basis
Bingfeng Luo | Yansong Feng | Jianbo Xu | Xiang Zhang | Dongyan Zhao
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is fact description. We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework. The experimental results show that, besides providing legal basis, the relevant articles can also clearly improve the charge prediction results, and our full model can effectively predict appropriate charges for cases with different expression styles.

2016

bib
Methods and Theories for Large-scale Structured Prediction
Xu Sun | Yansong Feng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Many important NLP tasks are casted as structured prediction problems, and try to predict certain forms of structured output from the input. Examples of structured prediction include POS tagging, named entity recognition, PCFG parsing, dependency parsing, machine translation, and many others. When apply structured prediction to a specific NLP task, there are the following challenges:1. Model selection: Among various models/algorithms with different characteristics, which one should we choose for a specific NLP task?2. Training: How to train the model parameters effectively and efficiently?3. Overfitting: To achieve good accuracy on test data, it is important to control the overfitting from the training data. How to control the overfitting risk for structured prediction?This tutorial will provide a clear overview of recent advances in structured prediction methods and theories, and address the above issues when we apply structured prediction to NLP tasks. We will introduce large margin methods (e.g., perceptrons, MIRA), graphical models (e.g., CRFs), and deep learning methods (e.g., RNN, LSTM), and show the respective advantages and disadvantages for NLP applications. For the training algorithms, we will introduce online/ stochastic training methods, and we will introduce parallel online/stochastic learning algorithms and theories to speed up the training (e.g., the Hogwild algorithm). For controlling the overfitting from training data, we will introduce the weight regularization methods, structure regularization, and implicit regularization methods.

pdf bib
Question Answering on Freebase via Relation Extraction and Textual Evidence
Kun Xu | Siva Reddy | Yansong Feng | Songfang Huang | Dongyan Zhao
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Hybrid Question Answering over Knowledge Base and Free Text
Kun Xu | Yansong Feng | Songfang Huang | Dongyan Zhao
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recent trend in question answering (QA) systems focuses on using structured knowledge bases (KBs) to find answers. While these systems are able to provide more precise answers than information retrieval (IR) based QA systems, the natural incompleteness of KB inevitably limits the question scope that the system can answer. In this paper, we present a hybrid question answering (hybrid-QA) system which exploits both structured knowledge base and free text to answer a question. The main challenge is to recognize the meaning of a question using these two resources, i.e., structured KB and free text. To address this, we map relational phrases to KB predicates and textual relations simultaneously, and further develop an integer linear program (ILP) model to infer on these candidates and provide a globally optimal solution. Experiments on benchmark datasets show that our system can benefit from both structured KB and free text, outperforming the state-of-the-art systems.

2015

pdf bib
Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling
Kun Xu | Yansong Feng | Songfang Huang | Dongyan Zhao
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Semantic Interpretation of Superlative Expressions via Structured Knowledge Bases
Sheng Zhang | Yansong Feng | Songfang Huang | Kun Xu | Zhe Han | Dongyan Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

pdf bib
Joint Inference for Knowledge Base Population
Liwei Chen | Yansong Feng | Jinghui Mo | Songfang Huang | Dongyan Zhao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Encoding Relation Requirements for Relation Extraction via Joint Inference
Liwei Chen | Yansong Feng | Songfang Huang | Yong Qin | Dongyan Zhao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

pdf bib
Explore Person Specific Evidence in Web Person Name Disambiguation
Liwei Chen | Yansong Feng | Lei Zou | Dongyan Zhao
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

pdf bib
Towards Automatic Construction of Knowledge Bases from Chinese Online Resources
Liwei Chen | Yansong Feng | Yidong Chen | Lei Zou | Dongyan Zhao
Proceedings of ACL 2012 Student Research Workshop

2010

pdf bib
Visual Information in Semantic Representation
Yansong Feng | Mirella Lapata
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Topic Models for Image Annotation and Text Illustration
Yansong Feng | Mirella Lapata
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
Yansong Feng | Mirella Lapata
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
Title Generation with Quasi-Synchronous Grammar
Kristian Woodsend | Yansong Feng | Mirella Lapata
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2008

pdf bib
Automatic Image Annotation Using Auxiliary Text Information
Yansong Feng | Mirella Lapata
Proceedings of ACL-08: HLT