Yuanzhe Zhang


2022

Logic Traps in Evaluating Attribution Scores
Yiming Ju | Yuanzhe Zhang | Zhao Yang | Zhongtao Jiang | Kang Liu | Jun Zhao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Modern deep learning models are notoriously opaque, which has motivated the development of methods for interpreting how deep models predict. This goal is usually approached with attribution methods, which assess the influence of features on model predictions. As an explanation method, an attribution method is evaluated by how accurately it reflects the actual reasoning process of the model (faithfulness). Meanwhile, since the reasoning process of deep models is inaccessible, researchers design various evaluation methods to demonstrate their arguments. However, some crucial logic traps in these evaluation methods are ignored in most works, causing inaccurate evaluation and unfair comparison. This paper systematically reviews existing methods for evaluating attribution scores and summarizes the logic traps in these methods. We further conduct experiments to demonstrate the existence of each logic trap. Through both theoretical and experimental analysis, we hope to increase attention on the inaccurate evaluation of attribution scores. Moreover, with this paper, we suggest that the community stop focusing on improving performance under unreliable evaluation systems and start working to reduce the impact of the proposed logic traps.

CMQA: A Dataset of Conditional Question Answering with Multiple-Span Answers
Yiming Ju | Weikang Wang | Yuanzhe Zhang | Suncong Zheng | Kang Liu | Jun Zhao
Proceedings of the 29th International Conference on Computational Linguistics

Forcing the answer of the Question Answering (QA) task to be a single text span might be restrictive since the answer can be multiple spans in the context. Moreover, we found that multi-span answers often appear with two characteristics when building the QA system for a real-world application. First, multi-span answers might be caused by users lacking domain knowledge and asking ambiguous questions, so that the question must be answered with conditions. Second, there might be hierarchical relations among multiple answer spans. Some recent span-extraction QA datasets include multi-span samples, but they only contain unconditional and parallel answers, which cannot be used to tackle this problem. To bridge the gap, we propose a new task: conditional question answering with hierarchical multi-span answers, where both the hierarchical relations and the conditions need to be extracted. Correspondingly, we introduce CMQA, a Conditional Multiple-span Chinese Question Answering dataset, to study the newly proposed task. The final release of CMQA consists of 7,861 QA pairs and 113,089 labels, where all samples contain multi-span answers, 50.4% of samples are conditional, and 56.6% of samples are hierarchical. CMQA can serve as a benchmark for the newly proposed task and support work on building QA systems for real-world applications. The low performance of models drawn from related literature shows that the newly proposed task is challenging for the community to solve.

2021

Biomedical Concept Normalization by Leveraging Hypernyms
Cheng Yan | Yuanzhe Zhang | Kang Liu | Jun Zhao | Yafei Shi | Shengping Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Biomedical Concept Normalization (BCN) is widely used in biomedical text processing as a fundamental module. Owing to the numerous surface variants of biomedical concepts, BCN remains challenging and unsolved. In this paper, we exploit biomedical concept hypernyms to facilitate BCN. We propose Biomedical Concept Normalizer with Hypernyms (BCNH), a novel framework that adopts list-wise training to make use of both hypernyms and synonyms, and also employs a norm constraint on the representation of hypernym-hyponym entity pairs. The experimental results show that BCNH outperforms the previous state-of-the-art model on the NCBI dataset.

Enhancing Multiple-choice Machine Reading Comprehension by Punishing Illogical Interpretations
Yiming Ju | Yuanzhe Zhang | Zhixing Tian | Kang Liu | Xiaohuan Cao | Wenting Zhao | Jinlong Li | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Machine Reading Comprehension (MRC), which requires a machine to answer questions given the relevant documents, is an important way to test machines’ ability to understand human language. Multiple-choice MRC is one of the most studied tasks in MRC due to the convenience of evaluation and the flexibility of answer format. Post-hoc interpretation aims to explain a trained model and reveal how the model arrives at the prediction. One of the most important interpretation forms is to attribute model decisions to input features. Based on post-hoc interpretation methods, we assess attributions of paragraphs in multiple-choice MRC and improve the model by punishing the illogical attributions. Our method can improve model performance without any external information and model structure change. Furthermore, we also analyze how and why such a self-training method works.

Alignment Rationale for Natural Language Inference
Zhongtao Jiang | Yuanzhe Zhang | Zhao Yang | Jun Zhao | Kang Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Deep learning models have achieved great success on the task of Natural Language Inference (NLI), though only a few attempts try to explain their behaviors. Existing explanation methods usually pick prominent features such as words or phrases from the input text. However, for NLI, alignments among words or phrases are more enlightening clues to explain the model. To this end, this paper presents AREC, a post-hoc approach to generate alignment rationale explanations for co-attention based models in NLI. The explanation is based on feature selection, which keeps few but sufficient alignments while maintaining the same prediction of the target model. Experimental results show that our method is more faithful and human-readable compared with many existing approaches. We further study and re-evaluate three typical models through our explanation beyond accuracy, and propose a simple method that greatly improves the model robustness.

2020

MIE: A Medical Information Extractor towards Medical Dialogues
Yuanzhe Zhang | Zhongtao Jiang | Tao Zhang | Shiwan Liu | Jiarun Cao | Kang Liu | Shengping Liu | Jun Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Electronic Medical Records (EMRs) have become key components of modern medical care systems. Despite the merits of EMRs, many doctors suffer from writing them, which is time-consuming and tedious. We believe that automatically converting medical dialogues to EMRs can greatly reduce the burdens of doctors, and extracting information from medical dialogues is an essential step. To this end, we annotate online medical consultation dialogues in a window-sliding style, which is much easier than the sequential labeling annotation. We then propose a Medical Information Extractor (MIE) towards medical dialogues. MIE is able to extract mentioned symptoms, surgeries, tests, other information and their corresponding status. To tackle the particular challenges of the task, MIE uses a deep matching architecture, taking dialogue turn-interaction into account. The experimental results demonstrate MIE is a promising solution to extract medical information from doctor-patient dialogues.

Scene Restoring for Narrative Machine Reading Comprehension
Zhixing Tian | Yuanzhe Zhang | Kang Liu | Jun Zhao | Yantao Jia | Zhicheng Sheng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper focuses on machine reading comprehension for narrative passages. Narrative passages usually describe a chain of events. When reading this kind of passage, humans tend to restore a scene according to the text with their prior knowledge, which helps them understand the passage comprehensively. Inspired by this human behavior, we propose a method that lets the machine imagine a scene while reading a narrative for better comprehension. Specifically, we build a scene graph by utilizing Atomic as the external knowledge and propose a novel Graph Dimensional-Iteration Network (GDIN) to encode the graph. We conduct experiments on ROCStories, a Story Cloze Test (SCT) dataset, and CosmosQA, a multiple-choice dataset. Our method achieves state-of-the-art results.

2019

Machine Reading Comprehension Using Structural Knowledge Graph-aware Network
Delai Qiu | Yuanzhe Zhang | Xinwei Feng | Xiangwen Liao | Wenbin Jiang | Yajuan Lyu | Kang Liu | Jun Zhao
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Leveraging external knowledge is an emerging trend in machine comprehension tasks. Previous work usually utilizes knowledge graphs such as ConceptNet as external knowledge, and extracts triples from them to enhance the initial representation of the machine comprehension context. However, such methods cannot capture the structural information in the knowledge graph. To this end, we propose a Structural Knowledge Graph-aware Network (SKG) model, constructing sub-graphs for entities in the machine comprehension context. Our method dynamically updates the representation of the knowledge according to the structural information of the constructed sub-graph. Experiments show that SKG achieves state-of-the-art performance on the ReCoRD dataset.

2017

An End-to-End Model for Question Answering over Knowledge Base with Cross-Attention Combining Global Knowledge
Yanchao Hao | Yuanzhe Zhang | Kang Liu | Shizhu He | Zhanyi Liu | Hua Wu | Jun Zhao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

With the rapid growth of knowledge bases (KBs) on the web, how to take full advantage of them becomes increasingly important. Question answering over knowledge base (KB-QA) is one of the promising approaches to access the substantial knowledge. Meanwhile, as neural network-based (NN-based) methods develop, NN-based KB-QA has already achieved impressive results. However, previous work did not put much emphasis on question representation, and the question is converted into a fixed vector regardless of its candidate answers. This simple representation strategy cannot easily express the proper information in the question. Hence, we present an end-to-end neural network model to represent the questions and their corresponding scores dynamically according to the various candidate answer aspects via a cross-attention mechanism. In addition, we leverage the global knowledge inside the underlying KB, aiming at integrating the rich KB information into the representation of the answers. As a result, it could alleviate the out-of-vocabulary (OOV) problem, which helps the cross-attention model to represent the question more precisely. The experimental results on WebQuestions demonstrate the effectiveness of the proposed approach.

2014

Question Answering over Linked Data Using First-order Logic
Shizhu He | Kang Liu | Yuanzhe Zhang | Liheng Xu | Jun Zhao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)