Elliot Schumacher


2023

pdf bib
On the Surprising Effectiveness of Name Matching Alone in Autoregressive Entity Linking
Elliot Schumacher | James Mayfield | Mark Dredze
Proceedings of the First Workshop on Matching From Unstructured and Structured Data (MATCHING 2023)

Fifteen years of work on entity linking has established the importance of different information sources in making linking decisions: mention and entity name similarity, contextual relevance, and features of the knowledge base. Modern state-of-the-art systems build on these features, including through neural representations (Wu et al., 2020). In contrast to this trend, the autoregressive language model GENRE (De Cao et al., 2021) generates normalized entity names for mentions and beats many other entity linking systems, despite making no use of knowledge base (KB) information. How is this possible? We analyze the behavior of GENRE on several entity linking datasets and demonstrate that its performance stems from memorization of name patterns. In contrast, it fails in cases that might benefit from using the KB. We experiment with a modification to the model to enable it to utilize KB information, highlighting challenges to incorporating traditional entity linking information sources into autoregressive models.

pdf bib
Generating medically-accurate summaries of patient-provider dialogue: A multi-stage approach using large language models
Varun Nair | Elliot Schumacher | Anitha Kannan
Proceedings of the 5th Clinical Natural Language Processing Workshop

A medical provider’s summary of a patient visit serves several critical purposes, including clinical decision-making, facilitating hand-offs between providers, and as a reference for the patient. An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue, despite the complexity of patient-generated language. Even minor inaccuracies in visit summaries (for example, summarizing “patient does not have a fever” when a fever is present) can be detrimental to the outcome of care for the patient. This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks that are sequentially built upon. First, we identify medical entities and their affirmations within the conversation to serve as building blocks. We study dynamically constructing few-shot prompts for tasks by conditioning on relevant patient information and use GPT-3 as the backbone for our experiments. We also develop GPT-derived summarization metrics to measure performance against reference summaries quantitatively. Both our human evaluation study and metrics for medical correctness show that summaries generated using this approach are clinically accurate and outperform the baseline approach of summarizing the dialog in a zero-shot, single-prompt setting.

2022

pdf bib
Zero-shot Cross-Language Transfer of Monolingual Entity Linking Models
Elliot Schumacher | James Mayfield | Mark Dredze
Proceedings of the The 2nd Workshop on Multi-lingual Representation Learning (MRL)

Most entity linking systems, whether mono or multilingual, link mentions to a single English knowledge base. Few have considered linking non-English text to a non-English KB, and therefore, transferring an English entity linking model to both a new document and KB language. We consider the task of zero-shot cross-language transfer of entity linking systems to a new language and KB. We find that a system trained with multilingual representations does reasonably well, and propose improvements to system training that lead to improved recall in most datasets, often matching the in-language performance. We further conduct a detailed evaluation to elucidate the challenges of this setting.

2021

pdf bib
Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking
Elliot Schumacher | James Mayfield | Mark Dredze
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
Clinical Concept Linking with Contextualized Neural Representations
Elliot Schumacher | Andriy Mulyar | Mark Dredze
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In traditional approaches to entity linking, linking decisions are based on three sources of information – the similarity of the mention string to an entity’s name, the similarity of the context of the document to the entity, and broader information about the knowledge base (KB). In some domains, there is little contextual information present in the KB and thus we rely more heavily on mention string similarity. We consider one example of this, concept linking, which seeks to link mentions of medical concepts to a medical concept ontology. We propose an approach to concept linking that leverages recent work in contextualized neural models, such as ELMo (Peters et al. 2018), which create a token representation that integrates the surrounding context of the mention and concept name. We find a neural ranking approach paired with contextualized embeddings provides gains over a competitive baseline (Leaman et al. 2013). Additionally, we find that a pre-training step using synonyms from the ontology offers a useful initialization for the ranker.

2016

pdf bib
Predicting the Relative Difficulty of Single Sentences With and Without Surrounding Context
Elliot Schumacher | Maxine Eskenazi | Gwen Frishkoff | Kevyn Collins-Thompson
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing