Sheng Zhang


2024

pdf bib
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Fei Wang | Wenxuan Zhou | James Y. Huang | Nan Xu | Sheng Zhang | Hoifung Poon | Muhao Chen
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Direct preference optimization (DPO) has shown to be an effective method for large language model (LLM) alignment. Recent works have attempted to apply DPO to multimodal scenarios but have found it challenging to achieve consistent improvement. Through a comparative experiment, we identify the unconditional preference problem in multimodal preference optimization, where the model overlooks the image condition. To address this problem, we propose mDPO, a multimodal DPO objective that prevents the over-prioritization of language-only preferences by also optimizing image preference. Moreover, we introduce a reward anchor that forces the reward to be positive for chosen responses, thereby avoiding the decrease in their likelihood—an intrinsic problem of relative preference optimization. Experiments on two multimodal LLMs of different sizes and three widely used benchmarks demonstrate that mDPO effectively addresses the unconditional preference problem in multimodal preference optimization and significantly improves model performance, particularly in reducing hallucination.

pdf bib
Code Membership Inference for Detecting Unauthorized Data Use in Code Pre-trained Language Models
Sheng Zhang | Hui Li | Rongrong Ji
Findings of the Association for Computational Linguistics: EMNLP 2024

Code pre-trained language models (CPLMs) have received great attention since they can benefit various tasks that facilitate software development and maintenance. However, CPLMs are trained on massive open-source code, raising concerns about potential data infringement. This paper launches the study of detecting unauthorized code use in CPLMs, i.e., Code Membership Inference (CMI) task. We design a framework Buzzer for different settings of CMI. Buzzer deploys several inference techniques, including signal extraction from pre-training tasks, hard-to-learn sample calibration and weighted inference, to identify code membership status accurately. Extensive experiments show that CMI can be achieved with high accuracy using Buzzer. Hence, Buzzer can serve as a CMI tool and help protect intellectual property rights. The implementation of Buzzer is available at: https://github.com/KDEGroup/Buzzer

pdf bib
DocLens: Multi-aspect Fine-grained Medical Text Evaluation
Yiqing Xie | Sheng Zhang | Hao Cheng | Pengfei Liu | Zelalem Gero | Cliff Wong | Tristan Naumann | Hoifung Poon | Carolyn Rose
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Medical text generation aims to assist with administrative work and highlight salient information to support decision-making.To reflect the specific requirements of medical text, in this paper, we propose a set of metrics to evaluate the completeness, conciseness, and attribution of the generated text at a fine-grained level. The metrics can be computed by various types of evaluators including instruction-following (both proprietary and open-source) and supervised entailment models. We demonstrate the effectiveness of the resulting framework, DocLens, with three evaluators on three tasks: clinical note generation, radiology report summarization, and patient question summarization. A comprehensive human study shows that DocLens exhibits substantially higher agreement with the judgments of medical experts than existing metrics. The results also highlight the need to improve open-source evaluators and suggest potential directions. We released the code at https://github.com/yiqingxyq/DocLens.

2023

pdf bib
Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Wenxuan Zhou | Sheng Zhang | Tristan Naumann | Muhao Chen | Hoifung Poon
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Relation extraction (RE), which has relied on structurally annotated corpora for model training, has been particularly challenging in low-resource scenarios and domains. Recent literature has tackled low-resource RE by self-supervised learning, where the solution involves pretraining the entity pair embedding by RE-based objective and finetuning on labeled data by classification-based objective. However, a critical challenge to this approach is the gap in objectives, which prevents the RE model from fully utilizing the knowledge in pretrained representations. In this paper, we aim at bridging the gap and propose to pretrain and finetune the RE model using consistent objectives of contrastive learning. Since in this kind of representation learning paradigm, one relation may easily form multiple clusters in the representation space, we further propose a multi-center contrastive loss that allows one relation to form multiple clusters to better align with pretraining. Experiments on two document-level RE datasets, BioRED and Re-DocRED, demonstrate the effectiveness of our method. Particularly, when using 1% end-task training data, our method outperforms PLM-based RE classifier by 10.5% and 6.1% on the two datasets, respectively.

pdf bib
Importance of Synthesizing High-quality Data for Text-to-SQL Parsing
Yiqun Hu | Yiyun Zhao | Jiarong Jiang | Wuwei Lan | Henghui Zhu | Anuj Chauhan | Alexander Hanbo Li | Lin Pan | Jun Wang | Chung-Wei Hang | Sheng Zhang | Jiang Guo | Mingwen Dong | Joseph Lilien | Patrick Ng | Zhiguo Wang | Vittorio Castelli | Bing Xiang
Findings of the Association for Computational Linguistics: ACL 2023

There has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed three shortcomings: illogical synthetic SQL queries from independent column sampling, arbitrary table joins, and language gaps between the synthesized SQL and natural language question (NLQ) pair. To address these issues, we propose a novel synthesis framework that imposes strong typing constraints, incorporates key relationships from schema, and conducts schema-distance-weighted column sampling. We also adopt an intermediate representation (IR) for the SQL-to-text task to further improve the quality of the generated NLQ. When existing powerful text-to-SQL parsers are pretrained on our high-quality synthesized data, these models have significant accuracy boosts and achieve new state-of-the-art performance on Spider. We also demonstrate the effectiveness of our techniques with ablation studies

pdf bib
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Xingyu Fu | Sheng Zhang | Gukyeong Kwon | Pramuditha Perera | Henghui Zhu | Yuhao Zhang | Alexander Hanbo Li | William Yang Wang | Zhiguo Wang | Vittorio Castelli | Patrick Ng | Dan Roth | Bing Xiang
Findings of the Association for Computational Linguistics: ACL 2023

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge. Recently, pre-trained Language Models (PLM) such as GPT-3 have been applied to the task and shown to be powerful world knowledge sources. However, these methods suffer from low knowledge coverage caused by PLM bias – the tendency to generate certain tokens over other tokens regardless of prompt changes, and high dependency on the PLM quality – only models using GPT-3 can achieve the best result. To address the aforementioned challenges, we propose RASO: a new VQA pipeline that deploys a generate-then-select strategy guided by world knowledge for the first time. Rather than following the de facto standard to train a multi-modal model that directly generates the VQA answer, {pasted macro ‘MODEL’}name first adopts PLM to generate all the possible answers, and then trains a lightweight answer selection model for the correct answer. As proved in our analysis, RASO expands the knowledge coverage from in-domain training data by a large margin. We provide extensive experimentation and show the effectiveness of our pipeline by advancing the state-of-the-art by 4.1% on OK-VQA, without additional computation cost.

pdf bib
Benchmarking Diverse-Modal Entity Linking with Generative Models
Sijia Wang | Alexander Hanbo Li | Henghui Zhu | Sheng Zhang | Pramuditha Perera | Chung-Wei Hang | Jie Ma | William Yang Wang | Zhiguo Wang | Vittorio Castelli | Bing Xiang | Patrick Ng
Findings of the Association for Computational Linguistics: ACL 2023

Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constructed a benchmark for diverse-modal EL (DMEL) from existing EL datasets, covering all three modalities including text, image and table. To approach the DMEL task, we proposed a generative diverse-modal model (GDMM) following a multimodal-encoder-decoder paradigm. Pre-training GDMM with rich corpora builds a solid foundation for DMEL without storing the entire KB for inference. Fine-tuning GDMM builds a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 score on average. Additionally, extensive error analyses are conducted to highlight the challenge of DMEL, facilitating future researches on this task.

pdf bib
Context-faithful Prompting for Large Language Models
Wenxuan Zhou | Sheng Zhang | Hoifung Poon | Muhao Chen
Findings of the Association for Computational Linguistics: EMNLP 2023

Large language models (LLMs) encode parametric knowledge about world facts and have shown remarkable performance in knowledge-driven NLP tasks. However, their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g., knowledge acquisition tasks). In this paper, we seek to assess and enhance LLMs’ contextual faithfulness in two aspects: knowledge conflict and prediction with abstention. We demonstrate that LLMs’ faithfulness can be significantly improved using carefully designed prompting strategies. In particular, we identify opinion-based prompts and counterfactual demonstrations as the most effective methods. Opinion-based prompts reframe the context as a narrator’s statement and inquire about the narrator’s opinions, while counterfactual demonstrations use instances containing false facts to improve faithfulness in knowledge conflict situations. Neither technique requires additional training. We conduct experiments on three datasets of two standard NLP tasks, machine reading comprehension and relation extraction, and the results demonstrate significant improvement in faithfulness to contexts. Code and data are released at https://github.com/wzhouad/context-faithful-llm.

pdf bib
Interactive Span Recommendation for Biomedical Text
Louis Blankemeier | Theodore Zhao | Robert Tinn | Sid Kiblawi | Yu Gu | Akshay Chaudhari | Hoifung Poon | Sheng Zhang | Mu Wei | J. Preston
Proceedings of the 5th Clinical Natural Language Processing Workshop

Motivated by the scarcity of high-quality labeled biomedical text, as well as the success of data programming, we introduce KRISS-Search. By leveraging the Unified Medical Language Systems (UMLS) ontology, KRISS-Search addresses an interactive few-shot span recommendation task that we propose. We first introduce unsupervised KRISS-Search and show that our method outperforms existing methods in identifying spans that are semantically similar to a given span of interest, with >50% AUPRC improvement relative to PubMedBERT. We then introduce supervised KRISS-Search, which leverages human interaction to improve the notion of similarity used by unsupervised KRISS-Search. Through simulated human feedback, we demonstrate an enhanced F1 score of 0.68 in classifying spans as semantically similar or different in the low-label setting, outperforming PubMedBERT by 2 F1 points. Finally, supervised KRISS-Search demonstrates competitive or superior performance compared to PubMedBERT in few-shot biomedical named entity recognition (NER) across five benchmark datasets, with an average improvement of 5.6 F1 points. We envision KRISS-Search increasing the efficiency of programmatic data labeling and also providing broader utility as an interactive biomedical search engine.

pdf bib
Compositional Zero-Shot Domain Transfer with Text-to-Text Models
Fangyu Liu | Qianchu Liu | Shruthi Bannur | Fernando Pérez-García | Naoto Usuyama | Sheng Zhang | Tristan Naumann | Aditya Nori | Hoifung Poon | Javier Alvarez-Valle | Ozan Oktay | Stephanie L. Hyland
Transactions of the Association for Computational Linguistics, Volume 11

Label scarcity is a bottleneck for improving task performance in specialized domains. We propose a novel compositional transfer learning framework (DoT51) for zero-shot domain transfer. Without access to in-domain labels, DoT5 jointly learns domain knowledge (from masked language modelling of unlabelled in-domain free text) and task knowledge (from task training on more readily available general-domain data) in a multi-task manner. To improve the transferability of task training, we design a strategy named NLGU: We simultaneously train natural language generation (NLG) for in-domain label-to-data generation, which enables data augmentation for self-finetuning and natural language understanding (NLU) for label prediction. We evaluate DoT5 on the biomedical domain and the resource-lean subdomain of radiology, focusing on natural language inference, text summarization, and embedding learning. DoT5 demonstrates the effectiveness of compositional transfer learning through multi-task learning. In particular, DoT5 outperforms the current state-of-the-art in zero-shot transfer by over 7 absolute points in accuracy on RadNLI. We validate DoT5 with ablations and a case study demonstrating its ability to solve challenging NLI examples requiring in-domain expertise.

2022

pdf bib
Locally Aggregated Feature Attribution on Natural Language Model Understanding
Sheng Zhang | Jin Wang | Haitao Jiang | Rui Song
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

With the growing popularity of deep-learning models, model understanding becomes more important. Much effort has been devoted to demystify deep neural networks for better explainability. Some feature attribution methods have shown promising results in computer vision, especially the gradient-based methods where effectively smoothing the gradients with reference data is the key to a robust and faithful result. However, direct application of these gradient-based methods to NLP tasks is not trivial due to the fact that the input consists of discrete tokens and the “reference” tokens are not explicitly defined. In this work, we propose Locally Aggregated Feature Attribution (LAFA), a novel gradient-based feature attribution method for NLP models. Instead of relying on obscure reference tokens, it smooths gradients by aggregating similar reference texts derived from language model embeddings. For evaluation purpose, we also design experiments on different NLP tasks including Entity Recognition and Sentiment Analysis on public datasets and key words detection on constructed Amazon catalogue dataset. The superior performance of the proposed method is demonstrated through experiments.

pdf bib
FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
Ilias Chalkidis | Tommaso Pasini | Sheng Zhang | Letizia Tomada | Sebastian Schwemer | Anders Søgaard
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks. Our benchmarks cover four jurisdictions (European Council, USA, Switzerland, and China), five languages (English, German, French, Italian and Chinese) and fairness across five attributes (gender, age, region, language, and legal area). In our experiments, we evaluate pre-trained language models using several group-robust fine-tuning techniques and show that performance group disparities are vibrant in many cases, while none of these techniques guarantee fairness, nor consistently mitigate group disparities. Furthermore, we provide a quantitative and qualitative analysis of our results, highlighting open challenges in the development of robustness methods in legal NLP.

pdf bib
Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning
Miryam de Lhoneux | Sheng Zhang | Anders Søgaard
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Large multilingual pretrained language models such as mBERT and XLM-RoBERTa have been found to be surprisingly effective for cross-lingual transfer of syntactic parsing models Wu and Dredze (2019), but only between related languages. However, source and training languages are rarely related, when parsing truly low-resource languages. To close this gap, we adopt a method from multi-task learning, which relies on automated curriculum learning, to dynamically optimize for parsing performance on outlier languages. We show that this approach is significantly better than uniform and size-proportional sampling in the zero-shot setting.

pdf bib
CLLE: A Benchmark for Continual Language Learning Evaluation in Multilingual Machine Translation
Han Zhang | Sheng Zhang | Yang Xiang | Bin Liang | Jinsong Su | Zhongjian Miao | Hui Wang | Ruifeng Xu
Findings of the Association for Computational Linguistics: EMNLP 2022

Continual Language Learning (CLL) in multilingual translation is inevitable when new languages are required to be translated. Due to the lack of unified and generalized benchmarks, the evaluation of existing methods is greatly influenced by experimental design which usually has a big gap from the industrial demands. In this work, we propose the first Continual Language Learning Evaluation benchmark CLLE in multilingual translation. CLLE consists of a Chinese-centric corpus — CN-25 and two CLL tasks — the close-distance language continual learning task and the language family continual learning task designed for real and disparate demands. Different from existing translation benchmarks, CLLE considers several restrictions for CLL, including domain distribution alignment, content overlap, language diversity, and the balance of corpus. Furthermore, we propose a novel framework COMETA based on Constrained Optimization and META-learning to alleviate catastrophic forgetting and dependency on history training data by using a meta-model to retain the important parameters for old languages. Our experiments prove that CLLE is a challenging CLL benchmark and that our proposed method is effective when compared with other strong baselines. Due to the construction of the corpus, the task designing and the evaluation method are independent of the centric language, we also construct and release the English-centric corpus EN-25 to facilitate academic research.

pdf bib
Knowledge-Rich Self-Supervision for Biomedical Entity Linking
Sheng Zhang | Hao Cheng | Shikhar Vashishth | Cliff Wong | Jinfeng Xiao | Xiaodong Liu | Tristan Naumann | Jianfeng Gao | Hoifung Poon
Findings of the Association for Computational Linguistics: EMNLP 2022

Entity linking faces significant challenges such as prolific variations and prevalent ambiguities, especially in high-value domains with myriad entities. Standard classification approaches suffer from the annotation bottleneck and cannot effectively handle unseen entities. Zero-shot entity linking has emerged as a promising direction for generalizing to new entities, but it still requires example gold entity mentions during training and canonical descriptions for all entities, both of which are rarely available outside of Wikipedia. In this paper, we explore Knowledge-RIch Self-Supervision (KRISS) for biomedical entity linking, by leveraging readily available domain knowledge. In training, it generates self-supervised mention examples on unlabeled text using a domain ontology and trains a contextual encoder using contrastive learning. For inference, it samples self-supervised mentions as prototypes for each entity and conducts linking by mapping the test mention to the most similar prototype. Our approach can easily incorporate entity descriptions and gold mention labels if available. We conducted extensive experiments on seven standard datasets spanning biomedical literature and clinical notes. Without using any labeled information, our method produces KRISSBERT, a universal entity linker for four million UMLS entities that attains new state of the art, outperforming prior self-supervised methods by as much as 20 absolute points in accuracy. We released KRISSBERT at https://aka.ms/krissbert.

2021

pdf bib
Joint Universal Syntactic and Semantic Parsing
Elias Stengel-Eskin | Kenton Murray | Sheng Zhang | Aaron Steven White | Benjamin Van Durme
Transactions of the Association for Computational Linguistics, Volume 9

While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other. This trade-off contradicts the large body of research focusing on the rich interactions at the syntax–semantics interface. We explore multiple model architectures that allow us to exploit the rich syntactic and semantic annotations contained in the Universal Decompositional Semantics (UDS) dataset, jointly parsing Universal Dependencies and UDS to obtain state-of-the-art results in both formalisms. We analyze the behavior of a joint model of syntax and semantics, finding patterns supported by linguistic theory at the syntax–semantics interface. We then investigate to what degree joint modeling generalizes to a multilingual setting, where we find similar trends across 8 languages.

pdf bib
Sociolectal Analysis of Pretrained Language Models
Sheng Zhang | Xin Zhang | Weiming Zhang | Anders Søgaard
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Using data from English cloze tests, in which subjects also self-reported their gender, age, education, and race, we examine performance differences of pretrained language models across demographic groups, defined by these (protected) attributes. We demonstrate wide performance gaps across demographic groups and show that pretrained language models systematically disfavor young non-white male speakers; i.e., not only do pretrained language models learn social biases (stereotypical associations) – pretrained language models also learn sociolectal biases, learning to speak more like some than like others. We show, however, that, with the exception of BERT models, larger pretrained language models reduce some the performance gaps between majority and minority groups.

pdf bib
Modular Self-Supervision for Document-Level Relation Extraction
Sheng Zhang | Cliff Wong | Naoto Usuyama | Sarthak Jain | Tristan Naumann | Hoifung Poon
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Extracting relations across large text spans has been relatively underexplored in NLP, but it is particularly important for high-value domains such as biomedicine, where obtaining high recall of the latest findings is crucial for practical applications. Compared to conventional information extraction confined to short text spans, document-level relation extraction faces additional challenges in both inference and learning. Given longer text spans, state-of-the-art neural architectures are less effective and task-specific self-supervision such as distant supervision becomes very noisy. In this paper, we propose decomposing document-level relation extraction into relation detection and argument resolution, taking inspiration from Davidsonian semantics. This enables us to incorporate explicit discourse modeling and leverage modular self-supervision for each sub-problem, which is less noise-prone and can be further refined end-to-end via variational EM. We conduct a thorough evaluation in biomedical machine reading for precision oncology, where cross-paragraph relation mentions are prevalent. Our method outperforms prior state of the art, such as multi-scale learning and graph neural networks, by over 20 absolute F1 points. The gain is particularly pronounced among the most challenging relation instances whose arguments never co-occur in a paragraph.

2020

pdf bib
The Universal Decompositional Semantics Dataset and Decomp Toolkit
Aaron Steven White | Elias Stengel-Eskin | Siddharth Vashishtha | Venkata Subrahmanyan Govindarajan | Dee Ann Reisinger | Tim Vieira | Keisuke Sakaguchi | Sheng Zhang | Francis Ferraro | Rachel Rudinger | Kyle Rawlins | Benjamin Van Durme
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present the Universal Decompositional Semantics (UDS) dataset (v1.0), which is bundled with the Decomp toolkit (v0.1). UDS1.0 unifies five high-quality, decompositional semantics-aligned annotation sets within a single semantic graph specification—with graph structures defined by the predicative patterns produced by the PredPatt tool and real-valued node and edge attributes constructed using sophisticated normalization procedures. The Decomp toolkit provides a suite of Python 3 tools for querying UDS graphs using SPARQL. Both UDS1.0 and Decomp0.1 are publicly available at http://decomp.io.

pdf bib
Universal Decompositional Semantic Parsing
Elias Stengel-Eskin | Aaron Steven White | Sheng Zhang | Benjamin Van Durme
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores. We also introduce a strong pipeline model for parsing into the UDS graph structure, and show that our transductive parser performs comparably while additionally performing attribute prediction. By analyzing the attribute prediction errors, we find the model captures natural relationships between attribute groups.

pdf bib
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning
Ye Liu | Sheng Zhang | Rui Song | Suo Feng | Yanghua Xiao
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Open attribute value extraction for emerging entities is an important but challenging task. A lot of previous works formulate the problem as a question-answering (QA) task. While the collections of articles from web corpus provide updated information about the emerging entities, the retrieved texts can be noisy, irrelevant, thus leading to inaccurate answers. Effectively filtering out noisy articles as well as bad answers is the key to improve extraction accuracy. Knowledge graph (KG), which contains rich, well organized information about entities, provides a good resource to address the challenge. In this work, we propose a knowledge-guided reinforcement learning (RL) framework for open attribute value extraction. Informed by relevant knowledge in KG, we trained a deep Q-network to sequentially compare extracted answers to improve extraction accuracy. The proposed framework is applicable to different information extraction system. Our experimental results show that our method outperforms the baselines by 16.5 - 27.8%.

2019

pdf bib
AMR Parsing as Sequence-to-Graph Transduction
Sheng Zhang | Xutai Ma | Kevin Duh | Benjamin Van Durme
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose an attention-based model that treats AMR parsing as sequence-to-graph transduction. Unlike most AMR parsers that rely on pre-trained aligners, external semantic resources, or data augmentation, our proposed parser is aligner-free, and it can be effectively trained with limited amounts of labeled AMR data. Our experimental results outperform all previously reported SMATCH scores, on both AMR 2.0 (76.3% on LDC2017T10) and AMR 1.0 (70.2% on LDC2014T12).

pdf bib
Unsupervised Deep Structured Semantic Models for Commonsense Reasoning
Shuohang Wang | Sheng Zhang | Yelong Shen | Xiaodong Liu | Jingjing Liu | Jianfeng Gao | Jing Jiang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Commonsense reasoning is fundamental to natural language understanding. While traditional methods rely heavily on human-crafted features and knowledge bases, we explore learning commonsense knowledge from a large amount of raw text via unsupervised learning. We propose two neural network models based on the Deep Structured Semantic Models (DSSM) framework to tackle two classic commonsense reasoning tasks, Winograd Schema challenges (WSC) and Pronoun Disambiguation (PDP). Evaluation shows that the proposed models effectively capture contextual information in the sentence and co-reference information between pronouns and nouns, and achieve significant improvement over previous state-of-the-art approaches.

pdf bib
Broad-Coverage Semantic Parsing as Transduction
Sheng Zhang | Xutai Ma | Kevin Duh | Benjamin Van Durme
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We unify different broad-coverage semantic parsing tasks into a transduction parsing paradigm, and propose an attention-based neural transducer that incrementally builds meaning representation via a sequence of semantic relations. By leveraging multiple attention mechanisms, the neural transducer can be effectively trained without relying on a pre-trained aligner. Experiments separately conducted on three broad-coverage semantic parsing tasks – AMR, SDP and UCCA – demonstrate that our attention-based neural transducer improves the state of the art on both AMR and UCCA, and is competitive with the state of the art on SDP.

pdf bib
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing
Simon Ostermann | Sheng Zhang | Michael Roth | Peter Clark
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing

pdf bib
Commonsense Inference in Natural Language Processing (COIN) - Shared Task Report
Simon Ostermann | Sheng Zhang | Michael Roth | Peter Clark
Proceedings of the First Workshop on Commonsense Inference in Natural Language Processing

This paper reports on the results of the shared tasks of the COIN workshop at EMNLP-IJCNLP 2019. The tasks consisted of two machine comprehension evaluations, each of which tested a system’s ability to answer questions/queries about a text. Both evaluations were designed such that systems need to exploit commonsense knowledge, for example, in the form of inferences over information that is available in the common ground but not necessarily mentioned in the text. A total of five participating teams submitted systems for the shared tasks, with the best submitted system achieving 90.6% accuracy and 83.7% F1-score on task 1 and task 2, respectively.

pdf bib
Deep Generalized Canonical Correlation Analysis
Adrian Benton | Huda Khayrallah | Biman Gujral | Dee Ann Reisinger | Sheng Zhang | Raman Arora
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

We present Deep Generalized Canonical Correlation Analysis (DGCCA) – a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other. While methods for nonlinear two view representation learning (Deep CCA, (Andrew et al., 2013)) and linear many-view representation learning (Generalized CCA (Horst, 1961)) exist, DGCCA combines the flexibility of nonlinear (deep) representation learning with the statistical power of incorporating information from many sources, or views. We present the DGCCA formulation as well as an efficient stochastic optimization algorithm for solving it. We learn and evaluate DGCCA representations for three downstream tasks: phonetic transcription from acoustic & articulatory measurements, recommending hashtags and recommending friends on a dataset of Twitter users.

2018

pdf bib
CRF-LSTM Text Mining Method Unveiling the Pharmacological Mechanism of Off-target Side Effect of Anti-Multiple Myeloma Drugs
Kaiyin Zhou | Sheng Zhang | Xiangyu Meng | Qi Luo | Yuxing Wang | Ke Ding | Yukun Feng | Mo Chen | Kevin Cohen | Jingbo Xia
Proceedings of the BioNLP 2018 workshop

Sequence labeling of biomedical entities, e.g., side effects or phenotypes, was a long-term task in BioNLP and MedNLP communities. Thanks to effects made among these communities, adverse reaction NER has developed dramatically in recent years. As an illuminative application, to achieve knowledge discovery via the combination of the text mining result and bioinformatics idea shed lights on the pharmacological mechanism research.

pdf bib
Neural-Davidsonian Semantic Proto-role Labeling
Rachel Rudinger | Adam Teichert | Ryan Culkin | Sheng Zhang | Benjamin Van Durme
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We present a model for semantic proto-role labeling (SPRL) using an adapted bidirectional LSTM encoding strategy that we call NeuralDavidsonian: predicate-argument structure is represented as pairs of hidden states corresponding to predicate and argument head tokens of the input sequence. We demonstrate: (1) state-of-the-art results in SPRL, and (2) that our network naturally shares parameters between attributes, allowing for learning new attribute types with limited added supervision.

pdf bib
Cross-lingual Decompositional Semantic Parsing
Sheng Zhang | Xutai Ma | Rachel Rudinger | Kevin Duh | Benjamin Van Durme
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We introduce the task of cross-lingual decompositional semantic parsing: mapping content provided in a source language into a decompositional semantic analysis based on a target language. We present: (1) a form of decompositional semantic analysis designed to allow systems to target varying levels of structural complexity (shallow to deep analysis), (2) an evaluation metric to measure the similarity between system output and reference semantic analysis, (3) an end-to-end model with a novel annotating mechanism that supports intra-sentential coreference, and (4) an evaluation dataset on which our model outperforms strong baselines by at least 1.75 F1 score.

pdf bib
Halo: Learning Semantics-Aware Representations for Cross-Lingual Information Extraction
Hongyuan Mei | Sheng Zhang | Kevin Duh | Benjamin Van Durme
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics

Cross-lingual information extraction (CLIE) is an important and challenging task, especially in low resource scenarios. To tackle this challenge, we propose a training method, called Halo, which enforces the local region of each hidden state of a neural model to only generate target tokens with the same semantic structure tag. This simple but powerful technique enables a neural model to learn semantics-aware representations that are robust to noise, without introducing any extra parameter, thus yielding better generalization in both high and low resource settings.

pdf bib
Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds
Sheng Zhang | Kevin Duh | Benjamin Van Durme
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics

Fine-grained entity typing is the task of assigning fine-grained semantic types to entity mentions. We propose a neural architecture which learns a distributional semantic representation that leverages a greater amount of semantic context – both document and sentence level information – than prior work. We find that additional context improves performance, with further improvements gained by utilizing adaptive classification thresholds. Experiments show that our approach without reliance on hand-crafted features achieves the state-of-the-art results on three benchmark datasets.

2017

pdf bib
FuRongWang at SemEval-2017 Task 3: Deep Neural Networks for Selecting Relevant Answers in Community Question Answering
Sheng Zhang | Jiajun Cheng | Hui Wang | Xin Zhang | Pei Li | Zhaoyun Ding
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We describes deep neural networks frameworks in this paper to address the community question answering (cQA) ranking task (SemEval-2017 task 3). Convolutional neural networks and bi-directional long-short term memory networks are applied in our methods to extract semantic information from questions and answers (comments). In addition, in order to take the full advantage of question-comment semantic relevance, we deploy interaction layer and augmented features before calculating the similarity. The results show that our methods have the great effectiveness for both subtask A and subtask C.

pdf bib
Selective Decoding for Cross-lingual Open Information Extraction
Sheng Zhang | Kevin Duh | Benjamin Van Durme
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language. We propose a novel encoder-decoder model for this problem. It employs a novel selective decoding mechanism, which explicitly models the sequence labeling process as well as the sequence generation process on the decoder side. Compared to a standard encoder-decoder model, selective decoding significantly increases the performance on a Chinese-English cross-lingual open IE dataset by 3.87-4.49 BLEU and 1.91-5.92 F1. We also extend our approach to low-resource scenarios, and gain promising improvement.

pdf bib
MT/IE: Cross-lingual Open Information Extraction with Neural Sequence-to-Sequence Models
Sheng Zhang | Kevin Duh | Benjamin Van Durme
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Cross-lingual information extraction is the task of distilling facts from foreign language (e.g. Chinese text) into representations in another language that is preferred by the user (e.g. English tuples). Conventional pipeline solutions decompose the task as machine translation followed by information extraction (or vice versa). We propose a joint solution with a neural sequence model, and show that it outperforms the pipeline in a cross-lingual open information extraction setting by 1-4 BLEU and 0.5-0.8 F1.

pdf bib
Ordinal Common-sense Inference
Sheng Zhang | Rachel Rudinger | Kevin Duh | Benjamin Van Durme
Transactions of the Association for Computational Linguistics, Volume 5

Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

pdf bib
An Evaluation of PredPatt and Open IE via Stage 1 Semantic Role Labeling
Sheng Zhang | Rachel Rudinger | Benjamin Van Durme
Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers

2016

pdf bib
Universal Decompositional Semantics on Universal Dependencies
Aaron Steven White | Drew Reisinger | Keisuke Sakaguchi | Tim Vieira | Sheng Zhang | Rachel Rudinger | Kyle Rawlins | Benjamin Van Durme
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

pdf bib
Semantic Interpretation of Superlative Expressions via Structured Knowledge Bases
Sheng Zhang | Yansong Feng | Songfang Huang | Kun Xu | Zhe Han | Dongyan Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Search
Co-authors