WonJin Yoon

Also published as: Wonjin Yoon

2025

Overview of the 2025 Shared Task on Chemotherapy Treatment Timeline Extraction
Jiarui Yao | Harry Hochheiser | WonJin Yoon | Eli T Goldner | Guergana K Savova
Proceedings of the 7th Clinical Natural Language Processing Workshop

Extracting patient treatment timelines from clinical notes is a complex task involving identification of relevant events, temporal expressions, and temporal relations in individual documents and developing cross-document summaries. The 2025 Shared Task on Chemotherapy Treatment Timeline Extraction builds upon the initial 2024 challenge, using data from 57,530 breast and ovarian cancer patients and 15,946 melanoma patients. Participants were provided with a subset annotated for treatment entities, temporal expressions, temporal relations, and timelines for each patient. This training data was used to address two subtasks. Subtask 1 focused on extracting temporal relations and creating timelines, given documents and gold-standard events and temporal expressions. Subtask 2 involved development of an end-to-end system involving extraction of entities, temporal expressions, and relations, and construction of timelines, given only the Electronic Health Record notes. Five teams participated, submitting eight entries for Subtask 1 and twelve for Subtask 2. Supervised fine-tuning remains a productive approach, albeit with a shift toward supervised fine-tuning of very large language models compared to the 2024 task edition. Even with the much more “strict” evaluation metric, the best results are comparable to the best less strict 2024 relaxed-to-month results.


Recent progress in large language models (LLMs) has enabled the automated processing of lengthy documents even without supervised training on a task-specific dataset. Yet their zero-shot performance on complex tasks, as opposed to straightforward information extraction tasks, remains suboptimal. One feasible approach for tasks with lengthy, complex input is to first summarize the document and then apply supervised fine-tuning to the summary. However, the summarization process inevitably results in some loss of information. In this study we present a method for processing the summaries of long documents aimed at capturing different important aspects of the original document. We hypothesize that LLM summaries generated with different aspect-oriented prompts contain different information signals, and we propose methods to measure these differences. We introduce approaches to effectively integrate signals from these different summaries for supervised training of transformer models. We validate our hypotheses on a high-impact task – 30-day readmission prediction from a psychiatric discharge – using real-world data from four hospitals, and show that our proposed method increases the prediction performance for the complex task of predicting patient outcome.


Using tournaments to calculate AUROC for zero-shot classification with LLMs
WonJin Yoon | Ian Bulovic | Timothy A. Miller
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models perform surprisingly well on many zero-shot classification tasks, but are difficult to fairly compare to supervised classifiers due to the lack of a modifiable decision boundary. In this work, we propose and evaluate a method that transforms binary classification tasks into pairwise comparisons between instances within a dataset, using LLMs to produce relative rankings of those instances. Repeated pairwise comparisons can be used to score instances using the Elo rating system (used in chess and other competitions), inducing a confidence ordering over instances in a dataset. We evaluate scheduling algorithms for their ability to minimize comparisons, and show that our proposed algorithm leads to improved classification performance, while also providing more information than traditional zero-shot classification.

2024


Overview of the 2024 Shared Task on Chemotherapy Treatment Timeline Extraction
Jiarui Yao | Harry Hochheiser | WonJin Yoon | Eli Goldner | Guergana Savova
Proceedings of the 6th Clinical Natural Language Processing Workshop

The 2024 Shared Task on Chemotherapy Treatment Timeline Extraction aims to advance the state of the art of clinical event timeline extraction from Electronic Health Records (EHRs). Specifically, this edition focuses on chemotherapy event timelines from EHRs of patients with breast, ovarian and skin cancers. These patient-level timelines present a novel challenge which involves tasks such as the extraction of relevant events, time expressions and temporal relations from each document and then summarizing over the documents. De-identified EHRs for 57,530 patients with breast and ovarian cancer spanning 2004-2020, and approximately 15,946 patients with melanoma spanning 2010-2020 were made available to participants after executing a Data Use Agreement. A subset of patients is annotated for gold entities, time expressions, temporal relations and patient-level timelines. The rest is considered unlabeled data. In Subtask 1, gold chemotherapy event mentions and time expressions are provided (along with the EHR notes). Participants are asked to build the patient-level timelines using gold annotations as input. Thus, the subtask seeks to explore the topics of temporal relations extraction and timeline creation if event and time expression input is perfect. In Subtask 2, which is the realistic real-world setting, only EHR notes are provided. Thus, the subtask aims at developing an end-to-end system for chemotherapy treatment timeline extraction from patients’ EHR notes. There were 18 submissions for Subtask 1 and 9 submissions for Subtask 2. The organizers provided a baseline system. The teams employed a variety of methods including Logistic Regression, TF-IDF, n-grams, transformer models, zero-shot prompting with Large Language Models (LLMs), and instruction tuning.
The gap in performance between prompting LLMs and finetuning smaller-sized LMs indicates that for a challenging task such as patient-level chemotherapy timeline extraction, more sophisticated LLMs or prompting techniques are necessary to achieve optimal results, as finetuning smaller-sized LMs outperforms prompting by a wide margin.

2022


Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework
Wonjin Yoon | Richard Jackson | Elliot Ford | Vladimir Poroshin | Jaewoo Kang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

In order to assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study in the field of BioNLP have produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open source framework designed to support BioNLP for the pharmaceutical sector. Kazu is built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and subsequently wraps several other BioNLP technologies into one coherent system.


KU_ED at SocialDisNER: Extracting Disease Mentions in Tweets Written in Spanish
Antoine Lain | Wonjin Yoon | Hyunjae Kim | Jaewoo Kang | Ian Simpson
Proceedings of the Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This paper describes our system developed for the Social Media Mining for Health (SMM4H) 2022 SocialDisNER task. We used several types of pre-trained language models, which are trained on Spanish biomedical literature or Spanish Tweets. We showed the difference in performance depending on the quality of the tokenization as well as on the introduction of silver standard annotations when training the model. Our model obtained a strict F1 of 80.3% on the test set, which is an improvement of +12.8% F1 (24.6 std) over the average results across all submissions to the SocialDisNER challenge.

2020


The recent outbreak of the novel coronavirus is wreaking havoc on the world and researchers are struggling to effectively combat it. One reason the fight is difficult is the lack of information and knowledge. In this work, we outline our effort to contribute to shrinking this knowledge vacuum by creating covidAsk, a question answering (QA) system that combines biomedical text mining and QA techniques to provide answers to questions in real-time. Our system also leverages information retrieval (IR) approaches to provide entity-level answers that are complementary to QA models. Evaluation of covidAsk is carried out by using a manually created dataset called COVID-19 Questions which is based on information from various sources, including the CDC and the WHO. We hope our system will be able to aid researchers in their search for knowledge and information not only for COVID-19, but for future pandemics as well.
