Yue Zhang


2021

pdf bib
Towards Navigation by Reasoning over Spatial Configurations
Yue Zhang | Quan Guo | Parisa Kordjamshidi
Proceedings of Second International Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics

We deal with the navigation problem where the agent follows natural language instructions while observing the environment. Focusing on language understanding, we show the importance of spatial semantics in grounding navigation instructions into visual perceptions. We propose a neural agent that uses the elements of spatial configurations and investigate their influence on the navigation agent’s reasoning ability. Moreover, we model the sequential execution order and align visual objects with spatial configurations in the instruction. Our neural agent improves strong baselines on the seen environments and shows competitive performance on the unseen environments. Additionally, the experimental results demonstrate that explicit modeling of spatial semantic elements in the instructions can improve the grounding and spatial reasoning of the model.

pdf bib
A Unified Span-Based Approach for Opinion Mining with Syntactic Constituents
Qingrong Xia | Bo Zhang | Rui Wang | Zhenghua Li | Yue Zhang | Fei Huang | Luo Si | Min Zhang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Fine-grained opinion mining (OM) has achieved increasing attraction in the natural language processing (NLP) community, which aims to find the opinion structures of “Who expressed what opinions towards what” in one sentence. In this work, motivated by its span-based representations of opinion expressions and roles, we propose a unified span-based approach for the end-to-end OM setting. Furthermore, inspired by the unified span-based formalism of OM and constituent parsing, we explore two different methods (multi-task learning and graph convolutional neural network) to integrate syntactic constituents into the proposed model to help OM. We conduct experiments on the commonly used MPQA 2.0 dataset. The experimental results show that our proposed unified span-based approach achieves significant improvements over previous works in the exact F1 score and reduces the number of wrongly-predicted opinion expressions and roles, showing the effectiveness of our method. In addition, incorporating the syntactic constituents achieves promising improvements over the strong baseline enhanced by contextualized word representations.

pdf bib
Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis
Linyi Yang | Jiazheng Li | Padraig Cunningham | Yue Zhang | Barry Smyth | Ruihai Dong
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

While state-of-the-art NLP models have been achieving the excellent performance of a wide range of tasks in recent years, important questions are being raised about their robustness and their underlying sensitivity to systematic biases that may exist in their training and test data. Such issues come to be manifest in performance problems when faced with out-of-distribution data in the field. One recent solution has been to use counterfactually augmented datasets in order to reduce any reliance on spurious patterns that may exist in the original data. Producing high-quality augmented data can be costly and time-consuming as it usually needs to involve human feedback and crowdsourcing efforts. In this work, we propose an alternative by describing and evaluating an approach to automatically generating counterfactual data for the purpose of data augmentation and explanation. A comprehensive evaluation on several different datasets and using a variety of state-of-the-art benchmarks demonstrate how our approach can achieve significant improvements in model performance when compared to models training on the original data and even when compared to models trained with the benefit of human-generated augmented data.

pdf bib
Can Generative Pre-trained Language Models Serve As Knowledge Bases for Closed-book QA?
Cunxiang Wang | Pai Liu | Yue Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recent work has investigated the interesting question using pre-trained language models (PLMs) as knowledge bases for answering open questions. However, existing work is limited in using small benchmarks with high test-train overlaps. We construct a new dataset of closed-book QA using SQuAD, and investigate the performance of BART. Experiments show that it is challenging for BART to remember training facts in high precision, and also challenging to answer closed-book questions even if relevant knowledge is retained. Some promising directions are found, including decoupling the knowledge memorizing process and the QA finetune process, forcing the model to recall relevant knowledge when question answering.

pdf bib
G-Transformer for Document-Level Machine Translation
Guangsheng Bao | Yue Zhang | Zhiyang Teng | Boxing Chen | Weihua Luo
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Document-level MT models are still far from satisfactory. Existing work extend translation unit from single sentence to multiple sentences. However, study shows that when we further enlarge the translation unit to a whole document, supervised training of Transformer can fail. In this paper, we find such failure is not caused by overfitting, but by sticking around local minima during training. Our analysis shows that the increased complexity of target-to-source attention is a reason for the failure. As a solution, we propose G-Transformer, introducing locality assumption as an inductive bias into Transformer, reducing the hypothesis space of the attention from target to source. Experiments show that G-Transformer converges faster and more stably than Transformer, achieving new state-of-the-art BLEU scores for both nonpretraining and pre-training settings on three benchmark datasets.

pdf bib
End-to-End AMR Coreference Resolution
Qiankun Fu | Linfeng Song | Wenyu Du | Yue Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Although parsing to Abstract Meaning Representation (AMR) has become very popular and AMR has been shown effective on the many sentence-level downstream tasks, little work has studied how to generate AMRs that can represent multi-sentence information. We introduce the first end-to-end AMR coreference resolution model in order to build multi-sentence AMRs. Compared with the previous pipeline and rule-based approaches, our model alleviates error propagation and it is more robust for both in-domain and out-domain situations. Besides, the document-level AMRs obtained by our model can significantly improve over the AMRs generated by a rule-based method (Liu et al., 2015) on text summarization.

pdf bib
Semantic Representation for Dialogue Modeling
Xuefeng Bai | Yulong Chen | Linfeng Song | Yue Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Although neural models have achieved competitive results in dialogue systems, they have shown limited ability in representing core semantics, such as ignoring important entities. To this end, we exploit Abstract Meaning Representation (AMR) to help dialogue modeling. Compared with the textual input, AMR explicitly provides core semantic knowledge and reduces data sparsity. We develop an algorithm to construct dialogue-level AMR graphs from sentence-level AMRs and explore two ways to incorporate AMRs into dialogue systems. Experimental results on both dialogue understanding and response generation tasks show the superiority of our model. To our knowledge, we are the first to leverage a formal semantic representation into neural dialogue modeling.

pdf bib
On Compositional Generalization of Neural Machine Translation
Yafu Li | Yongjing Yin | Yulong Chen | Yue Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Modern neural machine translation (NMT) models have achieved competitive performance in standard benchmarks such as WMT. However, there still exist significant issues such as robustness, domain generalization, etc. In this paper, we study NMT models from the perspective of compositional generalization by building a benchmark dataset, CoGnition, consisting of 216k clean and consistent sentence pairs. We quantitatively analyze effects of various factors using compound translation error rate, then demonstrate that the NMT model fails badly on compositional generalization, although it performs remarkably well under traditional metrics.

pdf bib
Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter
Wei Liu | Xiyan Fu | Yue Zhang | Wenming Xiao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Lexicon information and pre-trained models, such as BERT, have been combined to explore Chinese sequence labeling tasks due to their respective strengths. However, existing methods solely fuse lexicon features via a shallow and random initialized sequence layer and do not integrate them into the bottom layers of BERT. In this paper, we propose Lexicon Enhanced BERT (LEBERT) for Chinese sequence labeling, which integrates external lexicon knowledge into BERT layers directly by a Lexicon Adapter layer. Compared with existing methods, our model facilitates deep lexicon knowledge fusion at the lower layers of BERT. Experiments on ten Chinese datasets of three tasks including Named Entity Recognition, Word Segmentation, and Part-of-Speech Tagging, show that LEBERT achieves state-of-the-art results.

pdf bib
TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
Xiao Wang | Qin Liu | Tao Gui | Qi Zhang | Yicheng Zou | Xin Zhou | Jiacheng Ye | Yongxin Zhang | Rui Zheng | Zexiong Pang | Qinzhuo Wu | Zhengyan Li | Chong Zhang | Ruotian Ma | Zichu Fei | Ruijian Cai | Jun Zhao | Xingwu Hu | Zhiheng Yan | Yiding Tan | Yuan Hu | Qiyuan Bian | Zhihua Liu | Shan Qin | Bolin Zhu | Xiaoyu Xing | Jinlan Fu | Yue Zhang | Minlong Peng | Xiaoqing Zheng | Yaqian Zhou | Zhongyu Wei | Xipeng Qiu | Xuanjing Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

TextFlint is a multilingual robustness evaluation toolkit for NLP tasks that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analyses. This enables practitioners to automatically evaluate their models from various aspects or to customize their evaluations as desired with just a few lines of code. TextFlint also generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model in terms of its robustness. To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under human evaluation. To validate the utility, we performed large-scale empirical evaluations (over 67,000) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. The toolkit is already available at https://github.com/textflint with all the evaluation results demonstrated at textflint.io.

pdf bib
DialogSum Challenge: Summarizing Real-Life Scenario Dialogues
Yulong Chen | Yang Liu | Yue Zhang
Proceedings of the 14th International Conference on Natural Language Generation

We propose a shared task on summarizing real-life scenario dialogues, DialogSum Challenge, to encourage researchers to address challenges in dialogue summarization, which has been less studied by the summarization community. Real-life scenario dialogue summarization has a wide potential application prospect in chat-bot and personal assistant. It contains unique challenges such as special discourse structure, coreference, pragmatics, and social common sense, which require specific representation learning technologies to deal with. We carefully annotate a large-scale dialogue summarization dataset based on multiple public dialogue corpus, opening the door to all kinds of summarization models.

pdf bib
On Commonsense Cues in BERT for Solving Commonsense Tasks
Leyang Cui | Sijie Cheng | Yu Wu | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
A Comparison between Pre-training and Large-scale Back-translation for Neural Machine Translation
Dandan Huang | Kun Wang | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Template-Based Named Entity Recognition Using BART
Leyang Cui | Yu Wu | Jian Liu | Sen Yang | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Generalized Supervised Attention for Text Generation
Yixian Liu | Liwen Zhang | Xinyu Zhang | Yong Jiang | Yue Zhang | Kewei Tu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
DialogSum: A Real-Life Scenario Dialogue Summarization Dataset
Yulong Chen | Yang Liu | Liang Chen | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
What Did You Refer to? Evaluating Co-References in Dialogue
Wei-Nan Zhang | Yue Zhang | Hanlin Tang | Zhengyu Zhao | Caihai Zhu | Ting Liu
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
Making the Best Use of Review Summary for Sentiment Analysis
Sen Yang | Leyang Cui | Jun Xie | Yue Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Sentiment analysis provides a useful overview of customer review contents. Many review websites allow a user to enter a summary in addition to a full review. Intuitively, summary information may give additional benefit for review sentiment analysis. In this paper, we conduct a study to exploit methods for better use of summary information. We start by finding out that the sentimental signal distribution of a review and that of its corresponding summary are in fact complementary to each other. We thus explore various architectures to better guide the interactions between the two and propose a hierarchically-refined review-centric attention model. Empirical results show that our review-centric model can make better use of user-written summaries for review sentiment analysis, and is also more effective compared to existing methods when the user summary is replaced with summary generated by an automatic summarization system.

pdf bib
Financial Sentiment Analysis: An Investigation into Common Mistakes and Silver Bullets
Frank Xing | Lorenzo Malandri | Yue Zhang | Erik Cambria
Proceedings of the 28th International Conference on Computational Linguistics

The recent dominance of machine learning-based natural language processing methods has fostered the culture of overemphasizing model accuracies rather than studying the reasons behind their errors. Interpretability, however, is a critical requirement for many downstream AI and NLP applications, e.g., in finance, healthcare, and autonomous driving. This study, instead of proposing any “new model”, investigates the error patterns of some widely acknowledged sentiment analysis methods in the finance domain. We discover that (1) those methods belonging to the same clusters are prone to similar error patterns, and (2) there are six types of linguistic features that are pervasive in the common errors. These findings provide important clues and practical considerations for improving sentiment analysis models for financial applications.

pdf bib
Sentiment Forecasting in Dialog
Zhongqing Wang | Xiujun Zhu | Yue Zhang | Shoushan Li | Guodong Zhou
Proceedings of the 28th International Conference on Computational Linguistics

Sentiment forecasting in dialog aims to predict the polarity of next utterance to come, and can help speakers revise their utterances in sentimental utterances generation. However, the polarity of next utterance is normally hard to predict, due to the lack of content of next utterance (yet to come). In this study, we propose a Neural Sentiment Forecasting (NSF) model to address inherent challenges. In particular, we employ a neural simulation model to simulate the next utterance based on the context (previous utterances encountered). Moreover, we employ a sequence influence model to learn both pair-wise and seq-wise influence. Empirical studies illustrate the importance of proposed sentiment forecasting task, and justify the effectiveness of our NSF model over several strong baselines.

pdf bib
Does Chinese BERT Encode Word Structure?
Yile Wang | Leyang Cui | Yue Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Contextualized representations give significantly improved results for a wide range of NLP tasks. Much work has been dedicated to analyzing the features captured by representative models such as BERT. Existing work finds that syntactic, semantic and word sense knowledge are encoded in BERT. However, little work has investigated word features for character languages such as Chinese. We investigate Chinese BERT using both attention weight distribution statistics and probing tasks, finding that (1) word information is captured by BERT; (2) word-level features are mostly in the middle representation layers; (3) downstream tasks make different use of word features in BERT, with POS tagging and chunking relying the most on word features, and natural language inference relying the least on such features.

pdf bib
Semantic Role Labeling with Heterogeneous Syntactic Knowledge
Qingrong Xia | Rui Wang | Zhenghua Li | Yue Zhang | Min Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Recently, due to the interplay between syntax and semantics, incorporating syntactic knowledge into neural semantic role labeling (SRL) has achieved much attention. Most of the previous syntax-aware SRL works focus on explicitly modeling homogeneous syntactic knowledge over tree outputs. In this work, we propose to encode heterogeneous syntactic knowledge for SRL from both explicit and implicit representations. First, we introduce graph convolutional networks to explicitly encode multiple heterogeneous dependency parse trees. Second, we extract the implicit syntactic representations from syntactic parser trained with heterogeneous treebanks. Finally, we inject the two types of heterogeneous syntax-aware representations into the base SRL model as extra inputs. We conduct experiments on two widely-used benchmark datasets, i.e., Chinese Proposition Bank 1.0 and English CoNLL-2005 dataset. Experimental results show that incorporating heterogeneous syntactic knowledge brings significant improvements over strong baselines. We further conduct detailed analysis to gain insights on the usefulness of heterogeneous (vs. homogeneous) syntactic knowledge and the effectiveness of our proposed approaches for modeling such knowledge.

pdf bib
Porous Lattice Transformer Encoder for Chinese NER
Xue Mengge | Bowen Yu | Tingwen Liu | Yue Zhang | Erli Meng | Bin Wang
Proceedings of the 28th International Conference on Computational Linguistics

Incorporating lexicons into character-level Chinese NER by lattices is proven effective to exploitrich word boundary information. Previous work has extended RNNs to consume lattice inputsand achieved great success. However, due to the DAG structure and the inherently unidirectionalsequential nature, this method precludes batched computation and sufficient semantic interaction.In this paper, we propose PLTE, an extension of transformer encoder that is tailored for ChineseNER, which models all the characters and matched lexical words in parallel with batch process-ing. PLTE augments self-attention with positional relation representations to incorporate latticestructure. It also introduces a porous mechanism to augment localness modeling and maintainthe strength of capturing the rich long-term dependencies. Experimental results show that PLTEperforms up to 11.4 times faster than state-of-the-art methods while realizing better performance.We also demonstrate that using BERT representations further substantially boosts the performanceand brings out the best in PLTE.

pdf bib
Layer-Wise Multi-View Learning for Neural Machine Translation
Qiang Wang | Changliang Li | Yue Zhang | Tong Xiao | Jingbo Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Traditional neural machine translation is limited to the topmost encoder layer’s context representation and cannot directly perceive the lower encoder layers. Existing solutions usually rely on the adjustment of network architecture, making the calculation more complicated or introducing additional structural restrictions. In this work, we propose layer-wise multi-view learning to solve this problem, circumventing the necessity to change the model structure. We regard each encoder layer’s off-the-shelf output, a by-product in layer-by-layer encoding, as the redundant view for the input sentence. In this way, in addition to the topmost encoder layer (referred to as the primary view), we also incorporate an intermediate encoder layer as the auxiliary view. We feed the two views to a partially shared decoder to maintain independent prediction. Consistency regularization based on KL divergence is used to encourage the two views to learn from each other. Extensive experimental results on five translation tasks show that our approach yields stable improvements over multiple strong baselines. As another bonus, our method is agnostic to network architectures and can maintain the same inference speed as the original model.

pdf bib
Enhancing Generalization in Natural Language Inference by Syntax
Qi He | Han Wang | Yue Zhang
Findings of the Association for Computational Linguistics: EMNLP 2020

Pre-trained language models such as BERT have achieved the state-of-the-art performance on natural language inference (NLI). However, it has been shown that such models can be tricked by variations of surface patterns such as syntax. We investigate the use of dependency trees to enhance the generalization of BERT in the NLI task, leveraging on a graph convolutional network to represent a syntax-based matching graph with heterogeneous matching patterns. Experimental results show that, our syntax-based method largely enhance generalization of BERT on a test set where the sentence pair has high lexical overlap but diverse syntactic structures, and do not degrade performance on the standard test set. In other words, the proposed method makes BERT more robust on syntactic changes.

pdf bib
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs
Leonardo F. R. Ribeiro | Yue Zhang | Claire Gardent | Iryna Gurevych
Transactions of the Association for Computational Linguistics, Volume 8

Recent graph-to-text models generate text from graph-based data using either global or local aggregation to learn node representations. Global node encoding allows explicit communication between two distant nodes, thereby neglecting graph topology as all nodes are directly connected. In contrast, local node encoding considers the relations between neighbor nodes capturing the graph structure, but it can fail to capture long-range relations. In this work, we gather both encoding strategies, proposing novel neural models that encode an input graph combining both global and local node contexts, in order to learn better contextualized node embeddings. In our experiments, we demonstrate that our approaches lead to significant improvements on two graph-to-text datasets achieving BLEU scores of 18.01 on the AGENDA dataset, and 63.69 on the WebNLG dataset for seen categories, outperforming state-of-the-art models by 3.7 and 3.1 points, respectively.1

pdf bib
Multiscale Collaborative Deep Models for Neural Machine Translation
Xiangpeng Wei | Heng Yu | Yue Hu | Yue Zhang | Rongxiang Weng | Weihua Luo
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent evidence reveals that Neural Machine Translation (NMT) models with deeper neural networks can be more effective but are difficult to train. In this paper, we present a MultiScale Collaborative (MSC) framework to ease the training of NMT models that are substantially deeper than those used previously. We explicitly boost the gradient back-propagation from top to bottom levels by introducing a block-scale collaboration mechanism into deep NMT models. Then, instead of forcing the whole encoder stack directly learns a desired representation, we let each encoder block learns a fine-grained representation and enhance it by encoding spatial dependencies using a context-scale collaboration. We provide empirical evidence showing that the MSC nets are easy to optimize and can obtain improvements of translation quality from considerably increased depth. On IWSLT translation tasks with three translation directions, our extremely deep models (with 72-layer encoders) surpass strong baselines by +2.2~+3.1 BLEU points. In addition, our deep MSC achieves a BLEU score of 30.56 on WMT14 English-to-German task that significantly outperforms state-of-the-art deep NMT models. We have included the source code in supplementary materials.

pdf bib
MuTual: A Dataset for Multi-Turn Dialogue Reasoning
Leyang Cui | Yu Wu | Shujie Liu | Yue Zhang | Ming Zhou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Non-task oriented dialogue systems have achieved great success in recent years due to largely accessible conversation data and the development of deep learning techniques. Given a context, current systems are able to yield a relevant and fluent response, but sometimes make logical mistakes because of weak reasoning capabilities. To facilitate the conversation reasoning research, we introduce MuTual, a novel dataset for Multi-Turn dialogue Reasoning, consisting of 8,860 manually annotated dialogues based on Chinese student English listening comprehension exams. Compared to previous benchmarks for non-task oriented dialogue systems, MuTual is much more challenging since it requires a model that be able to handle various reasoning problems. Empirical results show that state-of-the-art methods only reach 71%, which is far behind human performance of 94%, indicating that there is ample room for improving reasoning ability.

pdf bib
Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences
Xiangyu Duan | Baijun Ji | Hao Jia | Min Tan | Min Zhang | Boxing Chen | Weihua Luo | Yue Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In this paper, we propose a new task of machine translation (MT), which is based on no parallel sentences but can refer to a ground-truth bilingual dictionary. Motivated by the ability of a monolingual speaker learning to translate via looking up the bilingual dictionary, we propose the task to see how much potential an MT system can attain using the bilingual dictionary and large scale monolingual corpora, while is independent on parallel sentences. We propose anchored training (AT) to tackle the task. AT uses the bilingual dictionary to establish anchoring points for closing the gap between source language and target language. Experiments on various language pairs show that our approaches are significantly better than various baselines, including dictionary-based word-by-word translation, dictionary-supervised cross-lingual word embedding transformation, and unsupervised MT. On distant language pairs that are hard for unsupervised MT to perform well, AT performs remarkably better, achieving performances comparable to supervised SMT trained on more than 4M parallel sentences.

pdf bib
Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks
Bo Zhang | Yue Zhang | Rui Wang | Zhenghua Li | Min Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Opinion role labeling (ORL) is a fine-grained opinion analysis task and aims to answer “who expressed what kind of sentiment towards what?”. Due to the scarcity of labeled data, ORL remains challenging for data-driven methods. In this work, we try to enhance neural ORL models with syntactic knowledge by comparing and integrating different representations. We also propose dependency graph convolutional networks (DEPGCN) to encode parser information at different processing levels. In order to compensate for parser inaccuracy and reduce error propagation, we introduce multi-task learning (MTL) to train the parser and the ORL model simultaneously. We verify our methods on the benchmark MPQA corpus. The experimental results show that syntactic information is highly valuable for ORL, and our final MTL model effectively boosts the F1 score by 9.29 over the syntax-agnostic baseline. In addition, we find that the contributions from syntactic knowledge do not fully overlap with contextualized word representations (BERT). Our best model achieves 4.34 higher F1 score than the current state-ofthe-art.

pdf bib
AMR Parsing with Latent Structural Information
Qiji Zhou | Yue Zhang | Donghong Ji | Hao Tang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Abstract Meaning Representations (AMRs) capture sentence-level semantics structural representations to broad-coverage natural sentences. We investigate parsing AMR with explicit dependency structures and interpretable latent structures. We generate the latent soft structure without additional annotations, and fuse both dependency and latent structure via an extended graph neural networks. The fused structural information helps our experiments results to achieve the best reported results on both AMR 2.0 (77.5% Smatch F1 on LDC2017T10) and AMR 1.0 ((71.8% Smatch F1 on LDC2014T12).

pdf bib
ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT
Linfeng Song | Kun Xu | Yue Zhang | Jianshu Chen | Dong Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Zero pronoun recovery and resolution aim at recovering the dropped pronoun and pointing out its anaphoric mentions, respectively. We propose to better explore their interaction by solving both tasks together, while the previous work treats them separately. For zero pronoun resolution, we study this task in a more realistic setting, where no parsing trees or only automatic trees are available, while most previous work assumes gold trees. Experiments on two benchmarks show that joint modeling significantly outperforms our baseline that already beats the previous state of the arts.

pdf bib
Multi-Cell Compositional LSTM for NER Domain Adaptation
Chen Jia | Yue Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Cross-domain NER is a challenging yet practical problem. Entity mentions can be highly different across domains. However, the correlations between entity types can be relatively more stable across domains. We investigate a multi-cell compositional LSTM structure for multi-task learning, modeling each entity type using a separate cell state. With the help of entity typed units, cross-domain knowledge transfer can be made in an entity type level. Theoretically, the resulting distinct feature distributions for each entity type make it more powerful for cross-domain transfer. Empirically, experiments on four few-shot and zero-shot datasets show our method significantly outperforms a series of multi-task learning methods and achieves the best results.

pdf bib
Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog
Libo Qin | Xiao Xu | Wanxiang Che | Yue Zhang | Ting Liu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent studies have shown remarkable success in end-to-end task-oriented dialog system. However, most neural models rely on large training data, which are only available for a certain number of task domains, such as navigation and scheduling. This makes it difficult to scalable for a new domain with limited labeled data. However, there has been relatively little research on how to effectively use data from all domains to improve the performance of each domain and also unseen domains. To this end, we investigate methods that can make explicit use of domain knowledge and introduce a shared-private network to learn shared and specific knowledge. In addition, we propose a novel Dynamic Fusion Network (DF-Net) which automatically exploit the relevance between the target domain and each domain. Results show that our models outperforms existing methods on multi-domain dialogue, giving the state-of-the-art in the literature. Besides, with little training data, we show its transferability by outperforming prior best model by 13.9% on average.

pdf bib
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
Wenyu Du | Zhouhan Lin | Yikang Shen | Timothy J. O’Donnell | Yoshua Bengio | Yue Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally efficiently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.

pdf bib
DRTS Parsing with Structure-Aware Encoding and Decoding
Qiankun Fu | Yue Zhang | Jiangming Liu | Meishan Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Discourse representation tree structure (DRTS) parsing is a novel semantic parsing task which has been concerned most recently. State-of-the-art performance can be achieved by a neural sequence-to-sequence model, treating the tree construction as an incremental sequence generation problem. Structural information such as input syntax and the intermediate skeleton of the partial output has been ignored in the model, which could be potentially useful for the DRTS parsing. In this work, we propose a structural-aware model at both the encoder and decoder phase to integrate the structural information, where graph attention network (GAT) is exploited for effectively modeling. Experimental results on a benchmark dataset show that our proposed model is effective and can obtain the best performance in the literature.

pdf bib
Structural Information Preserving for Graph-to-Text Generation
Linfeng Song | Ante Wang | Jinsong Su | Yue Zhang | Kun Xu | Yubin Ge | Dong Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs. As a crucial defect, the current state-of-the-art models may mess up or even drop the core structural information of input graphs when generating outputs. We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information. In particular, we introduce two types of autoencoding losses, each individually focusing on different aspects (a.k.a. views) of input graphs. The losses are then back-propagated to better calibrate our model via multi-task training. Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.

pdf bib
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Agata Savary | Yue Zhang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

pdf bib
Investigating Rich Feature Sources for Conceptual Representation Encoding
Lu Cao | Yulong Chen | Dandan Huang | Yue Zhang
Proceedings of the Workshop on the Cognitive Aspects of the Lexicon

Functional Magnetic Resonance Imaging (fMRI) provides a means to investigate human conceptual representation in cognitive and neuroscience studies, where researchers predict the fMRI activations with elicited stimuli inputs. Previous work mainly uses a single source of features, particularly linguistic features, to predict fMRI activations. However, relatively little work has been done on investigating rich-source features for conceptual representation. In this paper, we systematically compare the linguistic, visual as well as auditory input features in conceptual representation, and further introduce associative conceptual features, which are obtained from Small World of Words game, to predict fMRI activations. Our experimental results show that those rich-source features can enhance performance in predicting the fMRI activations. Our analysis indicates that information from rich sources is present in the conceptual representation of human brains. In particular, the visual feature weights the most on conceptual representation, which is consistent with the recent cognitive science study.

pdf bib
What Have We Achieved on Text Summarization?
Dandan Huang | Leyang Cui | Sen Yang | Guangsheng Bao | Kun Wang | Jun Xie | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 representative summarization models manually. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts thanks to strength in faithfulness and factual-consistency; 2) milestone techniques such as copy, coverage and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.

pdf bib
Online Back-Parsing for AMR-to-Text Generation
Xuefeng Bai | Linfeng Song | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

AMR-to-text generation aims to recover a text containing the same meaning as an input AMR graph. Current research develops increasingly powerful graph encoders to better represent AMR graphs, with decoders based on standard language modeling being used to generate outputs. We propose a decoder that back predicts projected AMR graphs on the target sentence during text generation. As the result, our outputs can better preserve the input meaning than standard decoders. Experiments on two AMR benchmarks show the superiority of our model over the previous state-of-the-art system based on graph Transformer.

pdf bib
Inducing Target-Specific Latent Structures for Aspect Sentiment Classification
Chenhua Chen | Zhiyang Teng | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Aspect-level sentiment analysis aims to recognize the sentiment polarity of an aspect or a target in a comment. Recently, graph convolutional networks based on linguistic dependency trees have been studied for this task. However, the dependency parsing accuracy of commercial product comments or tweets might be unsatisfactory. To tackle this problem, we associate linguistic dependency trees with automatically induced aspectspecific graphs. We propose gating mechanisms to dynamically combine information from word dependency graphs and latent graphs which are learned by self-attention networks. Our model can complement supervised syntactic features with latent semantic dependencies. Experimental results on five benchmarks show the effectiveness of our proposed latent models, giving significantly better results than models without using latent graphs.

pdf bib
Coarse-to-Fine Pre-training for Named Entity Recognition
Xue Mengge | Bowen Yu | Zhenyu Zhang | Tingwen Liu | Yue Zhang | Bin Wang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

More recently, Named Entity Recognition hasachieved great advances aided by pre-trainingapproaches such as BERT. However, currentpre-training techniques focus on building lan-guage modeling objectives to learn a gen-eral representation, ignoring the named entity-related knowledge. To this end, we proposea NER-specific pre-training framework to in-ject coarse-to-fine automatically mined entityknowledge into pre-trained models. Specifi-cally, we first warm-up the model via an en-tity span identification task by training it withWikipedia anchors, which can be deemed asgeneral-typed entities. Then we leverage thegazetteer-based distant supervision strategy totrain the model extract coarse-grained typedentities. Finally, we devise a self-supervisedauxiliary task to mine the fine-grained namedentity knowledge via clustering.Empiricalstudies on three public NER datasets demon-strate that our framework achieves significantimprovements against several pre-trained base-lines, establishing the new state-of-the-art per-formance on three benchmarks. Besides, weshow that our framework gains promising re-sults without using human-labeled trainingdata, demonstrating its effectiveness in label-few and low-resource scenarios.

pdf bib
Entity Enhanced BERT Pre-training for Chinese NER
Chen Jia | Yuefeng Shi | Qinrong Yang | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Character-level BERT pre-trained in Chinese suffers a limitation of lacking lexicon information, which shows effectiveness for Chinese NER. To integrate the lexicon into pre-trained LMs for Chinese NER, we investigate a semi-supervised entity enhanced BERT pre-training method. In particular, we first extract an entity lexicon from the relevant raw text using a new-word discovery method. We then integrate the entity information into BERT using Char-Entity-Transformer, which augments the self-attention using a combination of character and entity representations. In addition, an entity classification task helps inject the entity information into model parameters in pre-training. The pre-trained models are used for NER fine-tuning. Experiments on a news dataset and two datasets annotated by ourselves for NER in long-text show that our method is highly effective and achieves the best results.

pdf bib
SemEval-2020 Task 4: Commonsense Validation and Explanation
Cunxiang Wang | Shuailong Liang | Yili Jin | Yilong Wang | Xiaodan Zhu | Yue Zhang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this paper, we present SemEval-2020 Task 4, Commonsense Validation and Explanation (ComVE), which includes three subtasks, aiming to evaluate whether a system can distinguish a natural language statement that makes sense to humans from one that does not, and provide the reasons. Specifically, in our first subtask, the participating systems are required to choose from two natural language statements of similar wording the one that makes sense and the one does not. The second subtask additionally asks a system to select the key reason from three options why a given statement does not make sense. In the third subtask, a participating system needs to generate the reason automatically. 39 teams submitted their valid systems to at least one subtask. For Subtask A and Subtask B, top-performing teams have achieved results closed to human performance. However, for Subtask C, there is still a considerable gap between system and human performance. The dataset used in our task can be found at https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation.

pdf bib
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Maosong Sun (孙茂松) | Sujian Li (李素建) | Yue Zhang (张岳) | Yang Liu (刘洋)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

pdf bib
新型冠状病毒肺炎相关的推特主题与情感研究(Exploring COVID-19-related Twitter Topic Dynamics across Countries)
Shuailong Liang (梁帅龙) | Derek F. Wong (黄辉) | Yue Zhang (张岳)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

我们基于从2020年1月22日至2020年4月30日在推特社交平台上抓取的不同国家和地区发布的50万条推文,研究了有关 2019新型冠状病毒肺炎相关的主题和人们的观点,发现了不同国家之间推特用户的普遍关切和看法之间存在着异同,并且对不同议题的情感态度也有所不同。我们发现大部分推文中包含了强烈的情感,其中表达爱与支持的推文比较普遍。总体来看,人们的情感随着时间的推移逐渐正向增强。

pdf bib
Cross-Lingual Dependency Parsing via Self-Training
Meishan Zhang | Yue Zhang
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Recent advances of multilingual word representations weaken the input divergences across languages, making cross-lingual transfer similar to the monolingual cross-domain and semi-supervised settings. Thus self-training, which is effective for these settings, could be possibly beneficial to cross-lingual as well. This paper presents the first comprehensive study for self-training in cross-lingual dependency parsing. Three instance selection strategies are investigated, where two of which are based on the baseline dependency parsing model, and the third one adopts an auxiliary cross-lingual POS tagging model as evidence. We conduct experiments on the universal dependencies for eleven languages. Results show that self-training can boost the dependency parsing performances on the target languages. In addition, the POS tagger assistant instance selection can achieve further improvements consistently. Detailed analysis is conducted to examine the potentiality of self-training in-depth.

pdf bib
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset
Edwin Zhang | Nikhil Gupta | Raphael Tang | Xiao Han | Ronak Pradeep | Kuang Lu | Yue Zhang | Rodrigo Nogueira | Kyunghyun Cho | Hui Fang | Jimmy Lin
Proceedings of the First Workshop on Scholarly Document Processing

We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI. Our system has been online and serving users since late March 2020. The Covidex is the user application component of our three-pronged strategy to develop technologies for helping domain experts tackle the ongoing global pandemic. In addition, we provide robust and easy-to-use keyword search infrastructure that exploits mature fusion-based methods as well as standalone neural ranking models that can be incorporated into other applications. These techniques have been evaluated in the multi-round TREC-COVID challenge: Our infrastructure and baselines have been adopted by many participants, including some of the best systems. In round 3, we submitted the highest-scoring run that took advantage of previous training data and the second-highest fully automatic run. In rounds 4 and 5, we submitted the highest-scoring fully automatic runs.

2019

pdf bib
Leveraging Dependency Forest for Neural Medical Relation Extraction
Linfeng Song | Yue Zhang | Daniel Gildea | Mo Yu | Zhiguo Wang | Jinsong Su
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Medical relation extraction discovers relations between entity mentions in text, such as research articles. For this task, dependency syntax has been recognized as a crucial source of features. Yet in the medical domain, 1-best parse trees suffer from relatively low accuracies, diminishing their usefulness. We investigate a method to alleviate this problem by utilizing dependency forests. Forests contain more than one possible decisions and therefore have higher recall but more noise compared with 1-best outputs. A graph neural network is used to represent the forests, automatically distinguishing the useful syntactic information from parsing noise. Results on two benchmarks show that our method outperforms the standard tree-based methods, giving the state-of-the-art results in the literature.

pdf bib
Syntax-Enhanced Self-Attention-Based Semantic Role Labeling
Yue Zhang | Rui Wang | Luo Si
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

As a fundamental NLP task, semantic role labeling (SRL) aims to discover the semantic roles for each predicate within one sentence. This paper investigates how to incorporate syntactic knowledge into the SRL task effectively. We present different approaches of en- coding the syntactic information derived from dependency trees of different quality and representations; we propose a syntax-enhanced self-attention model and compare it with other two strong baseline methods; and we con- duct experiments with newly published deep contextualized word representations as well. The experiment results demonstrate that with proper incorporation of the high quality syntactic information, our model achieves a new state-of-the-art performance for the Chinese SRL task on the CoNLL-2009 dataset.

pdf bib
Cross-Lingual Dependency Parsing Using Code-Mixed TreeBank
Meishan Zhang | Yue Zhang | Guohong Fu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Treebank translation is a promising method for cross-lingual transfer of syntactic dependency knowledge. The basic idea is to map dependency arcs from a source treebank to its target translation according to word alignments. This method, however, can suffer from imperfect alignment between source and target words. To address this problem, we investigate syntactic transfer by code mixing, translating only confident words in a source treebank. Cross-lingual word embeddings are leveraged for transferring syntactic knowledge to the target from the resulting code-mixed treebank. Experiments on University Dependency Treebanks show that code-mixed treebanks are more effective than translated treebanks, giving highly competitive performances among cross-lingual parsing methods.

pdf bib
Contrastive Attention Mechanism for Abstractive Sentence Summarization
Xiangyu Duan | Hongfei Yu | Mingming Yin | Min Zhang | Weihua Luo | Yue Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose a contrastive attention mechanism to extend the sequence-to-sequence framework for abstractive sentence summarization task, which aims to generate a brief summary of a given source sentence. The proposed contrastive attention mechanism accommodates two categories of attention: one is the conventional attention that attends to relevant parts of the source sentence, the other is the opponent attention that attends to irrelevant or less relevant parts of the source sentence. Both attentions are trained in an opposite way so that the contribution from the conventional attention is encouraged and the contribution from the opponent attention is discouraged through a novel softmax and softmin functionality. Experiments on benchmark datasets show that, the proposed contrastive attention mechanism is more focused on the relevant parts for the summary than the conventional attention mechanism, and greatly advances the state-of-the-art performance on the abstractive sentence summarization task. We release the code at https://github.com/travel-go/ Abstractive-Text-Summarization.

pdf bib
A Pilot Study for Chinese SQL Semantic Parsing
Qingkai Min | Yuefeng Shi | Yue Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, and different embedding schemes. Results show that word-based semantic parser is subject to segmentation errors and cross-lingual word embeddings are useful for text-to-SQL.

pdf bib
Hierarchically-Refined Label Attention Network for Sequence Labeling
Leyang Cui | Yue Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

CRF has been used as a powerful model for statistical sequence labeling. For neural sequence labeling, however, BiLSTM-CRF does not always lead to better results compared with BiLSTM-softmax local classification. This can be because the simple Markov label transition model of CRF does not give much information gain over strong neural encoding. For better representing label sequences, we investigate a hierarchically-refined label attention network, which explicitly leverages label embeddings and captures potential long-term label dependency by giving each word incrementally refined label distributions with hierarchical attention. Results on POS tagging, NER and CCG supertagging show that the proposed model not only improves the overall tagging accuracy with similar number of parameters, but also significantly speeds up the training and testing compared to BiLSTM-CRF.

pdf bib
Code-Switching for Enhancing NMT with Pre-Specified Translation
Kai Song | Yue Zhang | Heng Yu | Weihua Luo | Kun Wang | Min Zhang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Leveraging user-provided translation to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, making code-switched training data by replacing source phrases with their target translations. Our method does not change the MNT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words.

pdf bib
Subword Encoding in Lattice LSTM for Chinese Word Segmentation
Jie Yang | Yue Zhang | Shuailong Liang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We investigate subword information for Chinese word segmentation, by integrating sub word embeddings trained using byte-pair encoding into a Lattice LSTM (LaLSTM) network over a character sequence. Experiments on standard benchmark show that subword information brings significant gains over strong character-based segmentation models. To our knowledge, this is the first research on the effectiveness of subwords on neural word segmentation.

pdf bib
Improving Cross-Domain Chinese Word Segmentation with Word Embeddings
Yuxiao Ye | Weikang Li | Yue Zhang | Likun Qiu | Jian Sun
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Cross-domain Chinese Word Segmentation (CWS) remains a challenge despite recent progress in neural-based CWS. The limited amount of annotated data in the target domain has been the key obstacle to a satisfactory performance. In this paper, we propose a semi-supervised word-based approach to improving cross-domain CWS given a baseline segmenter. Particularly, our model only deploys word embeddings trained on raw text in the target domain, discarding complex hand-crafted features and domain-specific dictionaries. Innovative subsampling and negative sampling methods are proposed to derive word embeddings optimized for CWS. We conduct experiments on five datasets in special domains, covering domains in novels, medicine, and patent. Results show that our model can obviously improve cross-domain CWS, especially in the segmentation of domain-specific noun entities. The word F-measure increases by over 3.0% on four datasets, outperforming state-of-the-art semi-supervised and unsupervised cross-domain CWS approaches with a large margin. We make our data and code available on Github.

pdf bib
SUDA-Alibaba at MRP 2019: Graph-Based Models with BERT
Yue Zhang | Wei Jiang | Qingrong Xia | Junjie Cao | Rui Wang | Zhenghua Li | Min Zhang
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

In this paper, we describe our participating systems in the shared task on Cross- Framework Meaning Representation Parsing (MRP) at the 2019 Conference for Computational Language Learning (CoNLL). The task includes five frameworks for graph-based meaning representations, i.e., DM, PSD, EDS, UCCA, and AMR. One common characteristic of our systems is that we employ graph-based methods instead of transition-based methods when predicting edges between nodes. For SDP, we jointly perform edge prediction, frame tagging, and POS tagging via multi-task learning (MTL). For UCCA, we also jointly model a constituent tree parsing and a remote edge recovery task. For both EDS and AMR, we produce nodes first and edges second in a pipeline fashion. External resources like BERT are found helpful for all frameworks except AMR. Our final submission ranks the third on the overall MRP evaluation metric, the first on EDS and the second on UCCA.

pdf bib
Multi-Granular Text Encoding for Self-Explaining Categorization
Zhiguo Wang | Yue Zhang | Mo Yu | Wei Zhang | Lin Pan | Linfeng Song | Kun Xu | Yousef El-Kurdi
Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Self-explaining text categorization requires a classifier to make a prediction along with supporting evidence. A popular type of evidence is sub-sequences extracted from the input text which are sufficient for the classifier to make the prediction. In this work, we define multi-granular ngrams as basic units for explanation, and organize all ngrams into a hierarchical structure, so that shorter ngrams can be reused while computing longer ngrams. We leverage the tree-structured LSTM to learn a context-independent representation for each unit via parameter sharing. Experiments on medical disease classification show that our model is more accurate, efficient and compact than the BiLSTM and CNN baselines. More importantly, our model can extract intuitive multi-granular evidence to support its predictions.

pdf bib
Cross-Domain NER using Cross-Domain Language Modeling
Chen Jia | Xiaobo Liang | Yue Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Due to limitation of labeled resources, cross-domain named entity recognition (NER) has been a challenging task. Most existing work considers a supervised setting, making use of labeled data for both the source and target domains. A disadvantage of such methods is that they cannot train for domains without NER data. To address this issue, we consider using cross-domain LM as a bridge cross-domains for NER domain adaptation, performing cross-domain and cross-task knowledge transfer by designing a novel parameter generation network. Results show that our method can effectively extract domain differences from cross-domain LM contrast, allowing unsupervised domain adaptation while also giving state-of-the-art results among supervised domain adaptation methods.

pdf bib
Open Domain Event Extraction Using Neural Latent Variable Models
Xiao Liu | Heyan Huang | Yue Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We consider open domain event extraction, the task of extracting unconstraint types of events from news clusters. A novel latent variable neural model is constructed, which is scalable to very large corpus. A dataset is collected and manually annotated, with task-specific evaluation metrics being designed. Results show that the proposed unsupervised model gives better performance compared to the state-of-the-art method for event schema induction.

pdf bib
Leveraging Local and Global Patterns for Self-Attention Networks
Mingzhou Xu | Derek F. Wong | Baosong Yang | Yue Zhang | Lidia S. Chao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Self-attention networks have received increasing research attention. By default, the hidden states of each word are hierarchically calculated by attending to all words in the sentence, which assembles global information. However, several studies pointed out that taking all signals into account may lead to overlooking neighboring information (e.g. phrase pattern). To address this argument, we propose a hybrid attention mechanism to dynamically leverage both of the local and global information. Specifically, our approach uses a gating scalar for integrating both sources of the information, which is also convenient for quantifying their contributions. Experiments on various neural machine translation tasks demonstrate the effectiveness of the proposed method. The extensive analyses verify that the two types of contexts are complementary to each other, and our method gives highly effective improvements in their integration.

pdf bib
Tree Communication Models for Sentiment Analysis
Yuan Zhang | Yue Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Tree-LSTMs have been used for tree-based sentiment analysis over Stanford Sentiment Treebank, which allows the sentiment signals over hierarchical phrase structures to be calculated simultaneously. However, traditional tree-LSTMs capture only the bottom-up dependencies between constituents. In this paper, we propose a tree communication model using graph convolutional neural network and graph recurrent neural network, which allows rich information exchange between phrases constituent tree. Experiments show that our model outperforms existing work on bidirectional tree-LSTMs in both accuracy and efficiency, providing more consistent predictions on phrase-level sentiments.

pdf bib
Does it Make Sense? And Why? A Pilot Study for Sense Making and Explanation
Cunxiang Wang | Shuailong Liang | Yue Zhang | Xiaonan Li | Tian Gao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Introducing common sense to natural language understanding systems has received increasing research attention. It remains a fundamental question on how to evaluate whether a system has the sense-making capability. Existing benchmarks measure common sense knowledge indirectly or without reasoning. In this paper, we release a benchmark to directly test whether a system can differentiate natural language statements that make sense from those that do not make sense. In addition, a system is asked to identify the most crucial reason why a statement does not make sense. We evaluate models trained over large-scale language modeling tasks as well as human performance, showing that there are different challenges for system sense-making.

pdf bib
Latent Variable Sentiment Grammar
Liwen Zhang | Kewei Tu | Yue Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Neural models have been investigated for sentiment classification over constituent trees. They learn phrase composition automatically by encoding tree structures but do not explicitly model sentiment composition, which requires to encode sentiment class labels. To this end, we investigate two formalisms with deep sentiment representations that capture sentiment subtype expressions by latent variables and Gaussian mixture vectors, respectively. Experiments on Stanford Sentiment Treebank (SST) show the effectiveness of sentiment grammar over vanilla neural encoders. Using ELMo embeddings, our method gives the best results on this benchmark.

pdf bib
Semantic Neural Machine Translation Using AMR
Linfeng Song | Daniel Gildea | Yue Zhang | Zhiguo Wang | Jinsong Su
Transactions of the Association for Computational Linguistics, Volume 7

It is intuitive that semantic representations can be useful for machine translation, mainly because they can help in enforcing meaning preservation and handling data sparsity (many sentences correspond to one meaning) of machine translation models. On the other hand, little work has been done on leveraging semantics for neural machine translation (NMT). In this work, we study the usefulness of AMR (abstract meaning representation) on NMT. Experiments on a standard English-to-German dataset show that incorporating AMR as additional knowledge can significantly improve a strong attention-based sequence-to-sequence neural translation model.

2018

pdf bib
Two Local Models for Neural Constituent Parsing
Zhiyang Teng | Yue Zhang
Proceedings of the 27th International Conference on Computational Linguistics

Non-local features have been exploited by syntactic parsers for capturing dependencies between sub output structures. Such features have been a key to the success of state-of-the-art statistical parsers. With the rise of deep learning, however, it has been shown that local output decisions can give highly competitive accuracies, thanks to the power of dense neural input representations that embody global syntactic information. We investigate two conceptually simple local neural models for constituent parsing, which make local decisions to constituent spans and CFG rules, respectively. Consistent with previous findings along the line, our best model gives highly competitive results, achieving the labeled bracketing F1 scores of 92.4% on PTB and 87.3% on CTB 5.1.

pdf bib
Learning Target-Specific Representations of Financial News Documents For Cumulative Abnormal Return Prediction
Junwen Duan | Yue Zhang | Xiao Ding | Ching-Yun Chang | Ting Liu
Proceedings of the 27th International Conference on Computational Linguistics

Texts from the Internet serve as important data sources for financial market modeling. Early statistical approaches rely on manually defined features to capture lexical, sentiment and event information, which suffers from feature sparsity. Recent work has considered learning dense representations for news titles and abstracts. Compared to news titles, full documents can contain more potentially helpful information, but also noise compared to events and sentences, which has been less investigated in previous work. To fill this gap, we propose a novel target-specific abstract-guided news document representation model. The model uses a target-sensitive representation of the news abstract to weigh sentences in the news content, so as to select and combine the most informative sentences for market modeling. Results show that document representations can give better performance for estimating cumulative abnormal returns of companies when compared to titles and abstracts. Our model is especially effective when it used to combine information from multiple document sources compared to the sentence-level baselines.

pdf bib
Design Challenges and Misconceptions in Neural Sequence Labeling
Jie Yang | Shuailong Liang | Yue Zhang
Proceedings of the 27th International Conference on Computational Linguistics

We investigate the design challenges of constructing effective and efficient neural sequence labeling systems, by reproducing twelve neural sequence labeling models, which include most of the state-of-the-art structures, and conduct a systematic model comparison on three benchmarks (i.e. NER, Chunking, and POS tagging). Misconceptions and inconsistent conclusions in existing literature are examined and clarified under statistical experiments. In the comparison and analysis process, we reach several practical conclusions which can be useful to practitioners.

pdf bib
Learning Domain Representation for Multi-Domain Sentiment Classification
Qi Liu | Yue Zhang | Jiangming Liu
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Training data for sentiment analysis are abundant in multiple domains, yet scarce for other domains. It is useful to leveraging data available for all existing domains to enhance performance on different domains. We investigate this problem by learning domain-specific representations of input sentences using neural network. In particular, a descriptor vector is learned for representing each domain, which is used to map adversarially trained domain-general Bi-LSTM input representations into domain-specific representations. Based on this model, we further expand the input representation with exemplary domain knowledge, collected by attending over a memory network of domain training data. Results show that our model outperforms existing methods on multi-domain sentiment analysis significantly, giving the best accuracies on two different benchmarks.

pdf bib
Mining Evidences for Concept Stock Recommendation
Qi Liu | Yue Zhang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We investigate the task of mining relevant stocks given a topic of concern on emerging capital markets, for which there is lack of structural understanding. Deep learning is leveraged to mine evidences from large scale textual data, which contain valuable market information. In particular, distributed word similarities trained over large scale raw texts are taken as a basis of relevance measuring, and deep reinforcement learning is leveraged to learn a strategy of topic expansion, given a small amount of manually labeled data from financial analysts. Results on two Chinese stock market datasets show that our method outperforms a strong baseline using information retrieval techniques.

pdf bib
Leveraging Context Information for Natural Question Generation
Linfeng Song | Zhiguo Wang | Wael Hamza | Yue Zhang | Daniel Gildea
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

The task of natural question generation is to generate a corresponding question given the input passage (fact) and answer. It is useful for enlarging the training set of QA systems. Previous work has adopted sequence-to-sequence models that take a passage with an additional bit to indicate answer position as input. However, they do not explicitly model the information between answer and other context within the passage. We propose a model that matches the answer with the passage before generating the question. Experiments show that our model outperforms the existing state of the art using rich features.

pdf bib
Cross-lingual Terminology Extraction for Translation Quality Estimation
Yu Yuan | Yuze Gao | Yue Zhang | Serge Sharoff
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Sentence-State LSTM for Text Representation
Yue Zhang | Qi Liu | Linfeng Song
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Bi-directional LSTMs are a powerful tool for text representation. On the other hand, they have been shown to suffer various limitations due to their sequential nature. We investigate an alternative LSTM structure for encoding text, which consists of a parallel state for each word. Recurrent steps are used to perform local and global information exchange between words simultaneously, rather than incremental reading of a sequence of words. Results on various classification and sequence labelling benchmarks show that the proposed model has strong representation power, giving highly competitive performances compared to stacked BiLSTM models with similar parameter numbers.

pdf bib
Chinese NER Using Lattice LSTM
Yue Zhang | Jie Yang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.

pdf bib
A Graph-to-Sequence Model for AMR-to-Text Generation
Linfeng Song | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The problem of AMR-to-text generation is to recover a text representing the same meaning as an input AMR graph. The current state-of-the-art method uses a sequence-to-sequence model, leveraging LSTM for encoding a linearized AMR structure. Although being able to model non-local semantic information, a sequence LSTM can lose information from the AMR graph structure, and thus facing challenges with large-graphs, which result in long sequences. We introduce a neural graph-to-sequence model, using a novel LSTM structure for directly encoding graph-level semantics. On a standard benchmark, our model shows superior results to existing methods in the literature.

pdf bib
A Rank-Based Similarity Metric for Word Embeddings
Enrico Santus | Hongmin Wang | Emmanuele Chersoni | Yue Zhang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Word Embeddings have recently imposed themselves as a standard for representing word meaning in NLP. Semantic similarity between word pairs has become the most common evaluation benchmark for these representations, with vector cosine being typically used as the only similarity metric. In this paper, we report experiments with a rank-based metric for WE, which performs comparably to vector cosine in similarity estimation and outperforms it in the recently-introduced and challenging task of outlier detection, thus suggesting that rank-based measures can improve clustering quality.

pdf bib
YEDDA: A Lightweight Collaborative Text Span Annotation Tool
Jie Yang | Yue Zhang | Linwei Li | Xingxuan Li
Proceedings of ACL 2018, System Demonstrations

In this paper, we introduce Yedda, a lightweight but efficient and comprehensive open-source tool for text span annotation. Yedda provides a systematic solution for text span annotation, ranging from collaborative user annotation to administrator evaluation and analysis. It overcomes the low efficiency of traditional text annotation tools by annotating entities through both command line and shortcut keys, which are configurable with custom labels. Yedda also gives intelligent recommendations by learning the up-to-date annotated text. An administrator client is developed to evaluate annotation quality of multiple annotators and generate detailed comparison report for each annotator pair. Experiments show that the proposed system can reduce the annotation time by half compared with existing annotation tools. And the annotation time can be further compressed by 16.47% through intelligent recommendation.

pdf bib
NCRF++: An Open-source Neural Sequence Labeling Toolkit
Jie Yang | Yue Zhang
Proceedings of ACL 2018, System Demonstrations

This paper describes NCRF++, a toolkit for neural sequence labeling. NCRF++ is designed for quick implementation of different neural sequence labeling models with a CRF inference layer. It provides users with an inference for building the custom model structure through configuration file with flexible neural feature design and utilization. Built on PyTorch http://pytorch.org/, the core operations are calculated in batch, making the toolkit efficient with the acceleration of GPU. It also includes the implementations of most state-of-the-art neural sequence labeling models such as LSTM-CRF, facilitating reproducing and refinement on those methods.

pdf bib
Neural Transition-based Syntactic Linearization
Linfeng Song | Yue Zhang | Daniel Gildea
Proceedings of the 11th International Conference on Natural Language Generation

The task of linearization is to find a grammatical order given a set of words. Traditional models use statistical methods. Syntactic linearization systems, which generate a sentence along with its syntactic tree, have shown state-of-the-art performance. Recent work shows that a multilayer LSTM language model outperforms competitive statistical syntactic linearization systems without using syntax. In this paper, we study neural syntactic linearization, building a transition-based syntactic linearizer leveraging a feed forward neural network, observing significantly better results compared to LSTM language models on this task.

pdf bib
N-ary Relation Extraction using Graph-State LSTM
Linfeng Song | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Cross-sentence n-ary relation extraction detects relations among n entities across multiple sentences. Typical methods formulate an input as a document graph, integrating various intra-sentential and inter-sentential dependencies. The current state-of-the-art method splits the input graph into two DAGs, adopting a DAG-structured LSTM for each. Though being able to model rich linguistic knowledge by leveraging graph edges, important information can be lost in the splitting procedure. We propose a graph-state LSTM model, which uses a parallel state to model each word, recurrently enriching state values via message passing. Compared with DAG LSTMs, our graph LSTM keeps the original graph structure, and speeds up computation by allowing more parallelization. On a standard benchmark, our model shows the best result in the literature.

bib
Joint models for NLP
Yue Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Joint models have received much research attention in NLP, allowing relevant tasks to share common information while avoiding error propagation in multi-stage pepelines. Several main approaches have been taken by statistical joint modeling, while neural models allow parameter sharing and adversarial training. This tutorial reviews main approaches to joint modeling for both statistical and neural methods.

2017

pdf bib
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing
Yue Zhang | Zhifang Sui
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing

pdf bib
Encoder-Decoder Shift-Reduce Syntactic Parsing
Jiangming Liu | Yue Zhang
Proceedings of the 15th International Conference on Parsing Technologies

Encoder-decoder neural networks have been used for many NLP tasks, such as neural machine translation. They have also been applied to constituent parsing by using bracketed tree structures as a target language, translating input sentences into syntactic trees. A more commonly used method to linearize syntactic trees is the shift-reduce system, which uses a sequence of transition-actions to build trees. We empirically investigate the effectiveness of applying the encoder-decoder network to transition-based parsing. On standard benchmarks, our system gives comparable results to the stack LSTM parser for dependency parsing, and significantly better results compared to the aforementioned parser for constituent parsing, which uses bracketed tree formats.

pdf bib
Attention-based Recurrent Convolutional Neural Network for Automatic Essay Scoring
Fei Dong | Yue Zhang | Jie Yang
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Neural network models have recently been applied to the task of automatic essay scoring, giving promising results. Existing work used recurrent neural networks and convolutional neural networks to model input essays, giving grades based on a single vector representation of the essay. On the other hand, the relative advantages of RNNs and CNNs have not been compared. In addition, different parts of the essay can contribute differently for scoring, which is not captured by existing models. We address these issues by building a hierarchical sentence-document model to represent essays, using the attention mechanism to automatically decide the relative weights of words and sentences. Results show that our model outperforms the previous state-of-the-art methods, demonstrating the effectiveness of the attention mechanism.

pdf bib
Neural Word Segmentation with Rich Pretraining
Jie Yang | Yue Zhang | Fei Dong
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural word segmentation research has benefited from large-scale raw texts by leveraging them for pretraining character and word embeddings. On the other hand, statistical segmentation research has exploited richer sources of external information, such as punctuation, automatic segmentation and POS. We investigate the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model, pretraining the most important submodule using rich external sources. Results show that such pretraining significantly improves the model, leading to accuracies competitive to the best methods on six benchmarks.

pdf bib
Universal Dependencies Parsing for Colloquial Singaporean English
Hongmin Wang | Yue Zhang | GuangYong Leonard Chan | Jie Yang | Hai Leong Chieu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Singlish can be interesting to the ACL community both linguistically as a major creole based on English, and computationally for information extraction and sentiment analysis of regional social media. We investigate dependency parsing of Singlish by constructing a dependency treebank under the Universal Dependencies scheme, and then training a neural network model by integrating English syntactic knowledge into a state-of-the-art parser trained on the Singlish treebank. Results show that English knowledge can lead to 25% relative error reduction, resulting in a parser of 84.47% accuracies. To the best of our knowledge, we are the first to use neural stacking to improve cross-lingual dependency parsing on low-resource languages. We make both our annotation and parser available for further research.

pdf bib
AMR-to-text Generation with Synchronous Node Replacement Grammar
Linfeng Song | Xiaochang Peng | Yue Zhang | Zhiguo Wang | Daniel Gildea
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This paper addresses the task of AMR-to-text generation by leveraging synchronous node replacement grammar. During training, graph-to-string rules are learned using a heuristic extraction algorithm. At test time, a graph transducer is applied to collapse input AMRs and generate output sentences. Evaluated on a standard benchmark, our method gives the state-of-the-art result.

pdf bib
Neural Reranking for Named Entity Recognition
Jie Yang | Yue Zhang | Fei Dong
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We propose a neural reranking system for named entity recognition (NER), leverages recurrent neural network models to learn sentence-level patterns that involve named entity mentions. In particular, given an output sentence produced by a baseline NER model, we replace all entity mentions, such as Barack Obama, into their entity types, such as PER. The resulting sentence patterns contain direct output information, yet is less sparse without specific named entities. For example, “PER was born in LOC” can be such a pattern. LSTM and CNN structures are utilised for learning deep representations of such sentences for reranking. Results show that our system can significantly improve the NER accuracies over two different baselines, giving the best reported results on a standard benchmark.

pdf bib
Integrating Order Information and Event Relation for Script Event Prediction
Zhongqing Wang | Yue Zhang | Ching-Yun Chang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

There has been a recent line of work automatically learning scripts from unstructured texts, by modeling narrative event chains. While the dominant approach group events using event pair relations, LSTMs have been used to encode full chains of narrative events. The latter has the advantage of learning long-range temporal orders, yet the former is more adaptive to partial orders. We propose a neural model that leverages the advantages of both methods, by using LSTM hidden states as features for event pair modelling. A dynamic memory network is utilized to automatically induce weights on existing events for inferring a subsequent event. Standard evaluation shows that our method significantly outperforms both methods above, giving the best results reported so far.

pdf bib
Word-Context Character Embeddings for Chinese Word Segmentation
Hao Zhou | Zhenting Yu | Yue Zhang | Shujian Huang | Xinyu Dai | Jiajun Chen
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Neural parsers have benefited from automatically labeled data via dependency-context word embeddings. We investigate training character embeddings on a word-based context in a similar way, showing that the simple method improves state-of-the-art neural word segmentation models significantly, beating tri-training baselines for leveraging auto-segmented data.

pdf bib
Opinion Recommendation Using A Neural Model
Zhongqing Wang | Yue Zhang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present opinion recommendation, a novel task of jointly generating a review with a rating score that a certain user would give to a certain product which is unreviewed by the user, given existing reviews to the product by other users, and the reviews that the user has given to other products. A characteristic of opinion recommendation is the reliance of multiple data sources for multi-task joint learning. We use a single neural network to model users and products, generating customised product representations using a deep memory network, from which customised ratings and reviews are constructed jointly. Results show that our opinion recommendation system gives ratings that are closer to real user ratings on Yelp.com data compared with Yelp’s own ratings. our methods give better results compared to several pipelines baselines.

pdf bib
End-to-End Neural Relation Extraction with Global Optimization
Meishan Zhang | Yue Zhang | Guohong Fu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Neural networks have shown promising results for relation extraction. State-of-the-art models cast the task as an end-to-end problem, solved incrementally using a local classifier. Yet previous work using statistical models have demonstrated that global optimization can achieve better performances compared to local classification. We build a globally optimized neural model for end-to-end relation extraction, proposing novel LSTM features in order to better learn context representations. In addition, we present a novel method to integrate syntactic information to facilitate global learning, yet requiring little background on syntactic grammars thus being easy to extend. Experimental results show that our proposed model is highly effective, achieving the best performances on two standard benchmarks.

pdf bib
Transition-Based Disfluency Detection using LSTMs
Shaolei Wang | Wanxiang Che | Yue Zhang | Meishan Zhang | Ting Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In this paper, we model the problem of disfluency detection using a transition-based framework, which incrementally constructs and labels the disfluency chunk of input sentences using a new transition system without syntax information. Compared with sequence labeling methods, it can capture non-local chunk-level features; compared with joint parsing and disfluency detection methods, it is free for noise in syntax. Experiments show that our model achieves state-of-the-art f-score of 87.5% on the commonly used English Switchboard test set, and a set of in-house annotated Chinese data.

pdf bib
Dependency Parsing with Partial Annotations: An Empirical Comparison
Yue Zhang | Zhenghua Li | Jun Lang | Qingrong Xia | Min Zhang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper describes and compares two straightforward approaches for dependency parsing with partial annotations (PA). The first approach is based on a forest-based training objective for two CRF parsers, i.e., a biaffine neural network graph-based parser (Biaffine) and a traditional log-linear graph-based parser (LLGPar). The second approach is based on the idea of constrained decoding for three parsers, i.e., a traditional linear graph-based parser (LGPar), a globally normalized neural network transition-based parser (GN3Par) and a traditional linear transition-based parser (LTPar). For the test phase, constrained decoding is also used for completing partial trees. We conduct experiments on Penn Treebank under three different settings for simulating PA, i.e., random, most uncertain, and divergent outputs from the five parsers. The results show that LLGPar is most effective in directly learning from PA, and other parsers can achieve best performance when PAs are completed into full trees by LLGPar.

pdf bib
Implicit Syntactic Features for Target-dependent Sentiment Analysis
Yuze Gao | Yue Zhang | Tong Xiao
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Targeted sentiment analysis investigates the sentiment polarities on given target mentions from input texts. Different from sentence level sentiment, it offers more fine-grained knowledge on each entity mention. While early work leveraged syntactic information, recent research has used neural representation learning to induce features automatically, thereby avoiding error propagation of syntactic parsers, which are particularly severe on social media texts. We study a method to leverage syntactic information without explicitly building the parser outputs, by training an encoder-decoder structure parser model on standard syntactic treebanks, and then leveraging its hidden encoder layers when analysing tweets. Such hidden vectors do not contain explicit syntactic outputs, yet encode rich syntactic features. We use them to augment the inputs to a baseline state-of-the-art targeted sentiment classifier, observing significant improvements on various benchmark datasets. We obtain the best accuracies on all test sets.

pdf bib
Abstractive Multi-document Summarization by Partial Tree Extraction, Recombination and Linearization
Litton J Kurisinkel | Yue Zhang | Vasudeva Varma
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Existing work for abstractive multidocument summarization utilise existing phrase structures directly extracted from input documents to generate summary sentences. These methods can suffer from lack of consistence and coherence in merging phrases. We introduce a novel approach for abstractive multidocument summarization through partial dependency tree extraction, recombination and linearization. The method entrusts the summarizer to generate its own topically coherent sequential structures from scratch for effective communication. Results on TAC 2011, DUC-2004 and 2005 show that our system gives competitive results compared with state of the art abstractive summarization approaches in the literature. We also achieve competitive results in linguistic quality assessed by human evaluators.

pdf bib
Deep Learning in Lexical Analysis and Parsing
Wanxiang Che | Yue Zhang
Proceedings of the IJCNLP 2017, Tutorial Abstracts

Neural networks, also with a fancy name deep learning, just right can overcome the above “feature engineering” problem. In theory, they can use non-linear activation functions and multiple layers to automatically find useful features. The novel network structures, such as convolutional or recurrent, help to reduce the difficulty further. These deep learning models have been successfully used for lexical analysis and parsing. In this tutorial, we will give a review of each line of work, by contrasting them with traditional statistical methods, and organizing them in consistent orders.

pdf bib
Shift-Reduce Constituent Parsing with Neural Lookahead Features
Jiangming Liu | Yue Zhang
Transactions of the Association for Computational Linguistics, Volume 5

Transition-based models can be fast and accurate for constituent parsing. Compared with chart-based models, they leverage richer features by extracting history information from a parser stack, which consists of a sequence of non-local constituents. On the other hand, during incremental parsing, constituent information on the right hand side of the current word is not utilized, which is a relative weakness of shift-reduce parsing. To address this limitation, we leverage a fast neural model to extract lookahead features. In particular, we build a bidirectional LSTM model, which leverages full sentence information to predict the hierarchy of constituents that each word starts and ends. The results are then passed to a strong transition-based constituent parser as lookahead features. The resulting parser gives 1.3% absolute improvement in WSJ and 2.3% in CTB compared to the baseline, giving the highest reported accuracies for fully-supervised parsing.

pdf bib
Head-Lexicalized Bidirectional Tree LSTMs
Zhiyang Teng | Yue Zhang
Transactions of the Association for Computational Linguistics, Volume 5

Sequential LSTMs have been extended to model tree structures, giving competitive results for a number of tasks. Existing methods model constituent trees by bottom-up combinations of constituent nodes, making direct use of input word information only for leaf nodes. This is different from sequential LSTMs, which contain references to input words for each node. In this paper, we propose a method for automatic head-lexicalization for tree-structure LSTMs, propagating head words from leaf nodes to every constituent node. In addition, enabled by head lexicalization, we build a tree LSTM in the top-down direction, which corresponds to bidirectional sequential LSTMs in structure. Experiments show that both extensions give better representations of tree structures. Our final model gives the best results on the Stanford Sentiment Treebank and highly competitive results on the TREC question type classification task.

pdf bib
In-Order Transition-based Constituent Parsing
Jiangming Liu | Yue Zhang
Transactions of the Association for Computational Linguistics, Volume 5

Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction. To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on the WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.

pdf bib
Transition-Based Deep Input Linearization
Ratish Puduppully | Yue Zhang | Manish Shrivastava
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Traditional methods for deep NLG adopt pipeline approaches comprising stages such as constructing syntactic input, predicting function words, linearizing the syntactic input and generating the surface forms. Though easier to visualize, pipeline approaches suffer from error propagation. In addition, information available across modules cannot be leveraged by all modules. We construct a transition-based model to jointly perform linearization, function word prediction and morphological generation, which considerably improves upon the accuracy compared to a pipelined baseline system. On a standard deep input linearization shared task, our system achieves the best results reported so far.

pdf bib
Attention Modeling for Targeted Sentiment
Jiangming Liu | Yue Zhang
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Neural network models have been used for target-dependent sentiment analysis. Previous work focus on learning a target specific representation for a given input sentence which is used for classification. However, they do not explicitly model the contribution of each word in a sentence with respect to targeted sentiment polarities. We investigate an attention model to this end. In particular, a vanilla LSTM model is used to induce an attention value of the whole sentence. The model is further extended to differentiate left and right contexts given a certain target following previous work. Results show that by using attention to model the contribution of each word with respect to the target, our model gives significantly improved results over two standard benchmarks. We report the best accuracy for this task.

2016

pdf bib
Neural Network for Heterogeneous Annotations
Hongshen Chen | Yue Zhang | Qun Liu
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Exploiting Mutual Benefits between Syntax and Semantic Roles using Neural Network
Peng Shi | Zhiyang Teng | Yue Zhang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Automatic Features for Essay Scoring – An Empirical Study
Fei Dong | Yue Zhang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Context-Sensitive Lexicon Features for Neural Sentiment Analysis
Zhiyang Teng | Duy-Tin Vo | Yue Zhang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
AMR-to-text generation as a Traveling Salesman Problem
Linfeng Song | Yue Zhang | Xiaochang Peng | Zhiguo Wang | Daniel Gildea
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

bib
Neural Networks for Sentiment Analysis
Yue Zhang | Duy Tin Vo
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Sentiment analysis has been a major research topic in natural language processing (NLP). Traditionally, the problem has been attacked using discrete models and manually-defined sparse features. Over the past few years, neural network models have received increased research efforts in most sub areas of sentiment analysis, giving highly promising results. A main reason is the capability of neural models to automatically learn dense features that capture subtle semantic information over words, sentences and documents, which are difficult to model using traditional discrete features based on words and ngram patterns. This tutorial gives an introduction to neural network models for sentiment analysis, discussing the mathematics of word embeddings, sequence models and tree structured models and their use in sentiment analysis on the word, sentence and document levels, and fine-grained sentiment analysis. The tutorial covers a range of neural network models (e.g. CNN, RNN, RecNN, LSTM) and their extensions, which are employed in four main subtasks of sentiment analysis:Sentiment-oriented embeddings;Sentence-level sentiment;Document-level sentiment;Fine-grained sentiment.The content of the tutorial is divided into 3 sections of 1 hour each. We assume that the audience is familiar with linear algebra and basic neural network structures, introduce the mathematical details of the most typical models. First, we will introduce the sentiment analysis task, basic concepts related to neural network models for sentiment analysis, and show detail approaches to integrate sentiment information into embeddings. Sentence-level models will be described in the second section. Finally, we will discuss neural network models use for document-level and fine-grained sentiment.

pdf bib
Active Learning for Dependency Parsing with Partial Annotation
Zhenghua Li | Min Zhang | Yue Zhang | Zhanyi Liu | Wenliang Chen | Hua Wu | Haifeng Wang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Transition-Based Neural Word Segmentation
Meishan Zhang | Yue Zhang | Guohong Fu
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
A Search-Based Dynamic Reranking Model for Dependency Parsing
Hao Zhou | Yue Zhang | Shujian Huang | Junsheng Zhou | Xin-Yu Dai | Jiajun Chen
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Don’t Count, Predict! An Automatic Approach to Learning Sentiment Lexicons for Short Text
Duy Tin Vo | Yue Zhang
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
LibN3L:A Lightweight Package for Neural NLP
Meishan Zhang | Jie Yang | Zhiyang Teng | Yue Zhang
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a light-weight machine learning tool for NLP research. The package supports operations on both discrete and dense vectors, facilitating implementation of linear models as well as neural models. It provides several basic layers which mainly aims for single-layer linear and non-linear transformations. By using these layers, we can conveniently implement linear models and simple neural models. Besides, this package also integrates several complex layers by composing those basic layers, such as RNN, Attention Pooling, LSTM and gated RNN. Those complex layers can be used to implement deep neural models directly.

pdf bib
Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing
Hao Zhou | Yue Zhang | Shujian Huang | Xin-Yu Dai | Jiajun Chen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Greedy transition-based parsers are appealing for their very fast speed, with reasonably high accuracies. In this paper, we build a fast shift-reduce neural constituent parser by using a neural network to make local decisions. One challenge to the parsing speed is the large hidden and output layer sizes caused by the number of constituent labels and branching options. We speed up the parser by using a hierarchical output layer, inspired by the hierarchical log-bilinear neural language model. In standard WSJ experiments, the neural parser achieves an almost 2.4 time speed up (320 sen/sec) compared to a non-hierarchical baseline without significant accuracy loss (89.06 vs 89.13 F-score).

pdf bib
Multi-prototype Chinese Character Embedding
Yanan Lu | Yue Zhang | Donghong Ji
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Chinese sentences are written as sequences of characters, which are elementary units of syntax and semantics. Characters are highly polysemous in forming words. We present a position-sensitive skip-gram model to learn multi-prototype Chinese character embeddings, and explore the usefulness of such character embeddings to Chinese NLP tasks. Evaluation on character similarity shows that multi-prototype embeddings are significantly better than a single-prototype baseline. In addition, used as features in the Chinese NER task, the embeddings result in a 1.74% F-score improvement over a state-of-the-art baseline.

pdf bib
Assessing the Prosody of Non-Native Speakers of English: Measures and Feature Sets
Eduardo Coutinho | Florian Hönig | Yue Zhang | Simone Hantke | Anton Batliner | Elmar Nöth | Björn Schuller
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper, we describe a new database with audio recordings of non-native (L2) speakers of English, and the perceptual evaluation experiment conducted with native English speakers for assessing the prosody of each recording. These annotations are then used to compute the gold standard using different methods, and a series of regression experiments is conducted to evaluate their impact on the performance of a regression model predicting the degree of naturalness of L2 speech. Further, we compare the relevance of different feature groups modelling prosody in general (without speech tempo), speech rate and pauses modelling speech tempo (fluency), voice quality, and a variety of spectral features. We also discuss the impact of various fusion strategies on performance.Overall, our results demonstrate that the prosody of non-native speakers of English as L2 can be reliably assessed using supra-segmental audio features; prosodic features seem to be the most important ones.

pdf bib
Expectation-Regulated Neural Model for Event Mention Extraction
Ching-Yun Chang | Zhiyang Teng | Yue Zhang
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Transition-Based Syntactic Linearization with Lookahead Features
Ratish Puduppully | Yue Zhang | Manish Shrivastava
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Deceptive Opinion Spam Detection Using Neural Network
Yafeng Ren | Yue Zhang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Deceptive opinion spam detection has attracted significant attention from both business and research communities. Existing approaches are based on manual discrete features, which can capture linguistic and psychological cues. However, such features fail to encode the semantic meaning of a document from the discourse perspective, which limits the performance. In this paper, we empirically explore a neural network model to learn document-level representation for detecting deceptive opinion spam. In particular, given a document, the model learns sentence representations with a convolutional neural network, which are combined using a gated recurrent neural network with attention mechanism to model discourse information and yield a document vector. Finally, the document representation is used directly as features to identify deceptive opinion spam. Experimental results on three domains (Hotel, Restaurant, and Doctor) show that our proposed method outperforms state-of-the-art methods.

pdf bib
A Bilingual Attention Network for Code-switched Emotion Prediction
Zhongqing Wang | Yue Zhang | Sophia Lee | Shoushan Li | Guodong Zhou
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Emotions in code-switching text can be expressed in either monolingual or bilingual forms. However, relatively little research has emphasized on code-switching text. In this paper, we propose a Bilingual Attention Network (BAN) model to aggregate the monolingual and bilingual informative words to form vectors from the document representation, and integrate the attention vectors to predict the emotion. The experiments show that the effectiveness of the proposed model. Visualization of the attention layers illustrates that the model selects qualitatively informative words.

pdf bib
Knowledge-Driven Event Embedding for Stock Prediction
Xiao Ding | Yue Zhang | Ting Liu | Junwen Duan
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Representing structured events as vectors in continuous space offers a new way for defining dense features for natural language processing (NLP) applications. Prior work has proposed effective methods to learn event representations that can capture syntactic and semantic information over text corpus, demonstrating their effectiveness for downstream tasks such as event-driven stock prediction. On the other hand, events extracted from raw texts do not contain background knowledge on entities and relations that they are mentioned. To address this issue, this paper proposes to leverage extra information from knowledge graph, which provides ground truth such as attributes and properties of entities and encodes valuable relations between entities. Specifically, we propose a joint model to combine knowledge graph information into the objective function of an event embedding learning model. Experiments on event similarity and stock market prediction show that our model is more capable of obtaining better event embeddings and making more accurate prediction on stock market volatilities.

pdf bib
Tweet Sarcasm Detection Using Deep Neural Network
Meishan Zhang | Yue Zhang | Guohong Fu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Sarcasm detection has been modeled as a binary document classification task, with rich features being defined manually over input documents. Traditional models employ discrete manual features to address the task, with much research effect being devoted to the design of effective feature templates. We investigate the use of neural network for tweet sarcasm detection, and compare the effects of the continuous automatic features with discrete manual features. In particular, we use a bi-directional gated recurrent neural network to capture syntactic and semantic information over tweets locally, and a pooling neural network to extract contextual features automatically from history tweets. Results show that neural features give improved accuracies for sarcasm detection, with different error distributions compared with discrete manual features.

pdf bib
Distance Metric Learning for Aspect Phrase Grouping
Shufeng Xiong | Yue Zhang | Donghong Ji | Yinxia Lou
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Aspect phrase grouping is an important task in aspect-level sentiment analysis. It is a challenging problem due to polysemy and context dependency. We propose an Attention-based Deep Distance Metric Learning (ADDML) method, by considering aspect phrase representation as well as context representation. First, leveraging the characteristics of the review text, we automatically generate aspect phrase sample pairs for distant supervision. Second, we feed word embeddings of aspect phrases and their contexts into an attention-based neural network to learn feature representation of contexts. Both aspect phrase embedding and context embedding are used to learn a deep feature subspace for measure the distances between aspect phrases for K-means clustering. Experiments on four review datasets show that the proposed method outperforms state-of-the-art strong baseline methods.

pdf bib
Measuring the Information Content of Financial News
Ching-Yun Chang | Yue Zhang | Zhiyang Teng | Zahn Bozanic | Bin Ke
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Measuring the information content of news text is useful for decision makers in their investments since news information can influence the intrinsic values of companies. We propose a model to automatically measure the information content given news text, trained using news and corresponding cumulative abnormal returns of listed companies. Existing methods in finance literature exploit sentiment signal features, which are limited by not considering factors such as events. We address this issue by leveraging deep neural models to extract rich semantic features from news text. In particular, a novel tree-structured LSTM is used to find target-specific representations of news text given syntax structures. Empirical results show that the neural models can outperform sentiment-based models, demonstrating the effectiveness of recent NLP technology advances for computational finance.

2015

pdf bib
An Empirical Comparison Between N-gram and Syntactic Language Models for Word Ordering
Jiangming Liu | Yue Zhang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Neural Networks for Open Domain Targeted Sentiment
Meishan Zhang | Yue Zhang | Duy-Tin Vo
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Combining Discrete and Continuous Features for Deterministic Transition-based Dependency Parsing
Meishan Zhang | Yue Zhang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
A Transition-based Model for Joint Segmentation, POS-tagging and Normalization
Tao Qian | Yue Zhang | Meishan Zhang | Yafeng Ren | Donghong Ji
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Syntactic Dependencies and Distributed Word Representations for Analogy Detection and Mining
Likun Qiu | Yue Zhang | Yanan Lu
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing
Liang-Chih Yu | Zhifang Sui | Yue Zhang | Vincent Ng
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

pdf bib
Discriminative Syntax-Based Word Ordering for Text Generation
Yue Zhang | Stephen Clark
Computational Linguistics, Volume 41, Issue 3 - September 2015

pdf bib
Event-Driven Headline Generation
Rui Sun | Yue Zhang | Meishan Zhang | Donghong Ji
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
A Neural Probabilistic Structured-Prediction Model for Transition-Based Dependency Parsing
Hao Zhou | Yue Zhang | Shujian Huang | Jiajun Chen
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Transition-Based Syntactic Linearization
Yijia Liu | Yue Zhang | Wanxiang Che | Bing Qin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf bib
Type-Supervised Domain Adaptation for Joint Segmentation and POS-Tagging
Meishan Zhang | Yue Zhang | Wanxiang Che | Ting Liu
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Multi-view Chinese Treebanking
Likun Qiu | Yue Zhang | Peng Jin | Houfeng Wang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Feature Embedding for Dependency Parsing
Wenliang Chen | Yue Zhang | Min Zhang
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib
Syntactic SMT Using a Discriminative Text Generation Model
Yue Zhang | Kai Song | Linfeng Song | Jingbo Zhu | Qun Liu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Domain Adaptation for CRF-based Chinese Word Segmentation using Free Annotations
Yijia Liu | Yue Zhang | Wanxiang Che | Ting Liu | Fan Wu
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Using Structured Events to Predict Stock Price Movement: An Empirical Investigation
Xiao Ding | Yue Zhang | Ting Liu | Junwen Duan
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
ZORE: A Syntax-based System for Chinese Open Relation Extraction
Likun Qiu | Yue Zhang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
Tagging The Web: Building A Robust Web Tagger with Neural Network
Ji Ma | Yue Zhang | Jingbo Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Shift-Reduce CCG Parsing with a Dependency Model
Wenduan Xu | Stephen Clark | Yue Zhang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Character-Level Chinese Dependency Parsing
Meishan Zhang | Yue Zhang | Wanxiang Che | Ting Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
On WordNet Semantic Classes and Dependency Parsing
Kepa Bengoetxea | Eneko Agirre | Joakim Nivre | Yue Zhang | Koldo Gojenola
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Punctuation Processing for Projective Dependency Parsing
Ji Ma | Yue Zhang | Jingbo Zhu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Syntactic Processing Using Global Discriminative Learning and Beam-Search Decoding
Yue Zhang | Meishan Zhang | Ting Liu
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Tutorials

2013

pdf bib
Semi-Supervised Feature Transformation for Dependency Parsing
Wenliang Chen | Min Zhang | Yue Zhang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Feature-Rich Segment-Based News Event Detection on Twitter
Yanxia Qin | Yue Zhang | Min Zhang | Dequan Zheng
Proceedings of the Sixth International Joint Conference on Natural Language Processing

pdf bib
Chinese Parsing Exploiting Characters
Meishan Zhang | Yue Zhang | Wanxiang Che | Ting Liu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Fast and Accurate Shift-Reduce Constituent Parsing
Muhua Zhu | Yue Zhang | Wenliang Chen | Min Zhang | Jingbo Zhu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Learning to Prune: Context-Sensitive Pruning for Syntactic MT
Wenduan Xu | Yue Zhang | Philip Williams | Philipp Koehn
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

pdf bib
Unsupervised Domain Adaptation for Joint Segmentation and POS-Tagging
Yang Liu | Yue Zhang
Proceedings of COLING 2012: Posters

pdf bib
Analyzing the Effect of Global Learning and Beam-Search on Transition-Based Dependency Parsing
Yue Zhang | Joakim Nivre
Proceedings of COLING 2012: Posters

pdf bib
Syntax-Based Word Ordering Incorporating a Large-Scale Language Model
Yue Zhang | Graeme Blackwood | Stephen Clark
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

pdf bib
Syntactic Processing Using the Generalized Perceptron and Beam Search
Yue Zhang | Stephen Clark
Computational Linguistics, Volume 37, Issue 1 - March 2011

pdf bib
Syntax-Based Grammaticality Improvement using CCG and Guided Search
Yue Zhang | Stephen Clark
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Shift-Reduce CCG Parsing
Yue Zhang | Stephen Clark
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Transition-based Dependency Parsing with Rich Non-local Features
Yue Zhang | Joakim Nivre
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Recent Advances in Dependency Parsing
Qin Iris Wang | Yue Zhang
NAACL HLT 2010 Tutorial Abstracts

pdf bib
A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model
Yue Zhang | Stephen Clark
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf bib
Chart Pruning for Fast Lexicalised-Grammar Parsing
Yue Zhang | Byung-Gyu Ahn | Stephen Clark | Curt Van Wyk | James R. Curran | Laura Rimell
Coling 2010: Posters

2009

pdf bib
Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model
Yue Zhang | Stephen Clark
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

2008

pdf bib
A Tale of Two Parsers: Investigating and Combining Graph-based and Transition-based Dependency Parsing
Yue Zhang | Stephen Clark
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Joint Word Segmentation and POS Tagging Using a Single Perceptron
Yue Zhang | Stephen Clark
Proceedings of ACL-08: HLT

2007

pdf bib
Chinese Segmentation with a Word-Based Perceptron Algorithm
Yue Zhang | Stephen Clark
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Search
Co-authors