Zeyu Zhang


pdf bib
Double Retrieval and Ranking for Accurate Question Answering
Zeyu Zhang | Thuy Vu | Alessandro Moschitti
Findings of the Association for Computational Linguistics: EACL 2023

Recent work has shown that an answer verification step introduced in Transformer-based answer selection models can significantly improve the state of the art in Question Answering. This step is performed by aggregating the embeddings of top k answer candidates to support the verification of a target answer. Although the approach is intuitive and sound, it still shows two limitations: (i) the supporting candidates are ranked only according to the relevancy with the question and not with the answer, and (ii) the support provided by the other answer candidates is suboptimal as these are retrieved independently of the target answer. In this paper, we address both drawbacks by proposing (i) a double reranking model, which, for each target answer, selects the best support; and (ii) a second neural retrieval stage designed to encode question and answer pair as the query, which finds more specific verification information. The results on well-known datasets for Answer Sentence Selection show significant improvement over the state of the art.

pdf bib
Hallucination Mitigation in Natural Language Generation from Large-Scale Open-Domain Knowledge Graphs
Xiao Shi | Zhengyuan Zhu | Zeyu Zhang | Chengkai Li
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In generating natural language descriptions for knowledge graph triples, prior works used either small-scale, human-annotated datasets or datasets with limited variety of graph shapes, e.g., those having mostly star graphs. Graph-to-text models trained and evaluated on such datasets are largely not assessed for more realistic large-scale, open-domain settings. We introduce a new dataset, GraphNarrative, to fill this gap. Fine-tuning transformer-based pre-trained language models has achieved state-of-the-art performance among graph-to-text models. However, this method suffers from information hallucination—the generated text may contain fabricated facts not present in input graphs. We propose a novel approach that, given a graph-sentence pair in GraphNarrative, trims the sentence to eliminate portions that are not present in the corresponding graph, by utilizing the sentence’s dependency parse tree. Our experiment results verify this approach using models trained on GraphNarrative and existing datasets. The dataset, source code, and trained models are released at https://github.com/idirlab/graphnarrator.

pdf bib
Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution
Zeyu Zhang | Steven Bethard
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry’s population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at \url{https://github.com/clulab/geonorm}.


pdf bib
Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction
Mihai Surdeanu | John Hungerford | Yee Seng Chan | Jessica MacBride | Benjamin Gyori | Andrew Zupon | Zheng Tang | Haoling Qiu | Bonan Min | Yan Zverev | Caitlin Hilverman | Max Thomas | Walter Andrews | Keith Alcock | Zeyu Zhang | Michael Reynolds | Steven Bethard | Rebecca Sharp | Egoitz Laparra
Proceedings of the Second Workshop on Bridging Human--Computer Interaction and Natural Language Processing

An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended. This is a slow and tedious task, and one which does not scale well. Here we propose an interactive tool that allows a taxonomy to be built or extended rapidly and with a human in the loop to control precision. We apply insights from text summarization and information extraction to reduce the search space dramatically, then leverage modern pretrained language models to perform contextualized clustering of the remaining concepts to yield candidate nodes for the user to review. We show this allows a user to consider as many as 200 taxonomy concept candidates an hour, to quickly build or extend a taxonomy to better fit information needs.


pdf bib
A Dashboard for Mitigating the COVID-19 Misinfodemic
Zhengyuan Zhu | Kevin Meng | Josue Caraballo | Israa Jaradat | Xiao Shi | Zeyu Zhang | Farahnaz Akrami | Haojin Liao | Fatma Arslan | Damian Jimenez | Mohanmmed Samiul Saeef | Paras Pathak | Chengkai Li
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

This paper describes the current milestones achieved in our ongoing project that aims to understand the surveillance of, impact of and intervention on COVID-19 misinfodemic on Twitter. Specifically, it introduces a public dashboard which, in addition to displaying case counts in an interactive map and a navigational panel, also provides some unique features not found in other places. Particularly, the dashboard uses a curated catalog of COVID-19 related facts and debunks of misinformation, and it displays the most prevalent information from the catalog among Twitter users in user-selected U.S. geographic regions. The paper explains how to use BERT models to match tweets with the facts and misinformation and to detect their stance towards such information. The paper also discusses the results of preliminary experiments on analyzing the spatio-temporal spread of misinformation.

pdf bib
Joint Models for Answer Verification in Question Answering Systems
Zeyu Zhang | Thuy Vu | Alessandro Moschitti
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

This paper studies joint models for selecting correct answer sentences among the top k provided by answer sentence selection (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. Our work shows that a critical step to effectively exploiting an answer set regards modeling the interrelated information between pair of answers. For this purpose, we build a three-way multi-classifier, which decides if an answer supports, refutes, or is neutral with respect to another one. More specifically, our neural architecture integrates a state-of-the-art AS2 module with the multi-classifier, and a joint layer connecting all components. We tested our models on WikiQA, TREC-QA, and a real-world dataset. The results show that our models obtain the new state of the art in AS2.


pdf bib
ScienceExamCER: A High-Density Fine-Grained Science-Domain Corpus for Common Entity Recognition
Hannah Smith | Zeyu Zhang | John Culnan | Peter Jansen
Proceedings of the Twelfth Language Resources and Evaluation Conference

Named entity recognition identifies common classes of entities in text, but these entity labels are generally sparse, limiting utility to downstream tasks. In this work we present ScienceExamCER, a densely-labeled semantic classification corpus of 133k mentions in the science exam domain where nearly all (96%) of content words have been annotated with one or more fine-grained semantic class labels including taxonomic groups, meronym groups, verb/action groups, properties and values, and synonyms. Semantic class labels are drawn from a manually-constructed fine-grained typology of 601 classes generated through a data-driven analysis of 4,239 science exam questions. We show an off-the-shelf BERT-based named entity recognition model modified for multi-label classification achieves an accuracy of 0.85 F1 on this task, suggesting strong utility for downstream tasks in science domain question answering requiring densely-labeled semantic classification.

pdf bib
A Generate-and-Rank Framework with Semantic Type Regularization for Biomedical Concept Normalization
Dongfang Xu | Zeyu Zhang | Steven Bethard
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Concept normalization, the task of linking textual mentions of concepts to concepts in an ontology, is challenging because ontologies are large. In most cases, annotated datasets cover only a small sample of the concepts, yet concept normalizers are expected to predict all concepts in the ontology. In this paper, we propose an architecture consisting of a candidate generator and a list-wise ranker based on BERT. The ranker considers pairings of concept mentions and candidate concepts, allowing it to make predictions for any concept, not just those seen during training. We further enhance this list-wise approach with a semantic type regularizer that allows the model to incorporate semantic type information from the ontology during training. Our proposed concept normalization framework achieves state-of-the-art performance on multiple datasets.