Meliha Yetisgen-Yildiz

Also published as: Meliha Yetisgen, Meliha Yetişgen


pdf bib
Building blocks for complex tasks: Robust generative event extraction for radiology reports under domain shifts
Sitong Zhou | Meliha Yetisgen | Mari Ostendorf
Proceedings of the 5th Clinical Natural Language Processing Workshop

This paper explores methods for extracting information from radiology reports that generalize across exam modalities to reduce requirements for annotated data. We demonstrate that multi-pass T5-based text-to-text generative models exhibit better generalization across exam modalities compared to approaches that employ BERT-based task-specific classification layers. We then develop methods that reduce the inference cost of the model, making large-scale corpus processing more feasible for clinical applications. Specifically, we introduce a generative technique that decomposes complex tasks into smaller subtask blocks, which improves a single-pass model when combined with multitask training. In addition, we leverage target-domain contexts during inference to enhance domain adaptation, enabling use of smaller models. Analyses offer insights into the benefits of different cost reduction strategies.

pdf bib
Prompt-based Extraction of Social Determinants of Health Using Few-shot Learning
Giridhar Kaushik Ramachandran | Yujuan Fu | Bin Han | Kevin Lybarger | Nic Dobbins | Ozlem Uzuner | Meliha Yetisgen
Proceedings of the 5th Clinical Natural Language Processing Workshop

Social determinants of health (SDOH) documented in the electronic health record through unstructured text are increasingly being studied to understand how SDOH impacts patient health outcomes. In this work, we utilize the Social History Annotation Corpus (SHAC), a multi-institutional corpus of de-identified social history sections annotated for SDOH, including substance use, employment, and living status information. We explore the automatic extraction of SDOH information with SHAC in both standoff and inline annotation formats using GPT-4 in a one-shot prompting setting. We compare GPT-4 extraction performance with a high-performing supervised approach and perform thorough error analyses. Our prompt-based GPT-4 method achieved an overall 0.652 F1 on the SHAC test set, similar to the 7th best-performing system among all teams in the n2c2 challenge with SHAC.

pdf bib
Overview of the MEDIQA-Chat 2023 Shared Tasks on the Summarization & Generation of Doctor-Patient Conversations
Asma Ben Abacha | Wen-wai Yim | Griffin Adams | Neal Snider | Meliha Yetisgen
Proceedings of the 5th Clinical Natural Language Processing Workshop

Automatic generation of clinical notes from doctor-patient conversations can play a key role in reducing daily doctors’ workload and improving their interactions with the patients. MEDIQA-Chat 2023 aims to advance and promote research on effective solutions through shared tasks on the automatic summarization of doctor-patient conversations and on the generation of synthetic dialogues from clinical notes for data augmentation. Seventeen teams participated in the challenge and experimented with a broad range of approaches and models. In this paper, we describe the three MEDIQA-Chat 2023 tasks, the datasets, and the participants’ results and methods. We hope that these shared tasks will lead to additional research efforts and insights on the automatic generation and evaluation of clinical notes.


pdf bib
Towards Automating Medical Scribing : Clinic Visit Dialogue2Note Sentence Alignment and Snippet Summarization
Wen-wai Yim | Meliha Yetisgen
Proceedings of the Second Workshop on Natural Language Processing for Medical Conversations

Medical conversations from patient visits are routinely summarized into clinical notes for documentation of clinical care. The automatic creation of clinical note is particularly challenging given that it requires summarization over spoken language and multiple speaker turns; as well, clinical notes include highly technical semi-structured text. In this paper, we describe our corpus creation method and baseline systems for two NLP tasks, clinical dialogue2note sentence alignment and clinical dialogue2note snippet summarization. These two systems, as well as other models created from such a corpus, may be incorporated as parts of an overall end-to-end clinical note generation system.


pdf bib
Alignment Annotation for Clinic Visit Dialogue to Clinical Note Sentence Language Generation
Wen-wai Yim | Meliha Yetisgen | Jenny Huang | Micah Grossman
Proceedings of the Twelfth Language Resources and Evaluation Conference

For every patient’s visit to a clinician, a clinical note is generated documenting their medical conversation, including complaints discussed, treatments, and medical plans. Despite advances in natural language processing, automating clinical note generation from a clinic visit conversation is a largely unexplored area of research. Due to the idiosyncrasies of the task, traditional methods of corpus creation are not effective enough approaches for this problem. In this paper, we present an annotation methodology that is content- and technique- agnostic while associating note sentences to sets of dialogue sentences. The sets can further be grouped with higher order tags to mark sets with related information. This direct linkage from input to output decouples the annotation from specific language understanding or generation strategies. Here we provide data statistics and qualitative analysis describing the unique annotation challenges. Given enough annotated data, such a resource would support multiple modeling methods including information extraction with template language generation, information retrieval type language generation, or sequence to sequence modeling.


pdf bib
Clinical Event Detection with Hybrid Neural Architecture
Adyasha Maharana | Meliha Yetisgen
BioNLP 2017

Event detection from clinical notes has been traditionally solved with rule based and statistical natural language processing (NLP) approaches that require extensive domain knowledge and feature engineering. In this paper, we have explored the feasibility of approaching this task with recurrent neural networks, clinical word embeddings and introduced a hybrid architecture to improve detection for entities with smaller representation in the dataset. A comparative analysis is also done which reveals the complementary behavior of neural networks and conditional random fields in clinical entity detection.


pdf bib
Annotating and Detecting Medical Events in Clinical Notes
Prescott Klassen | Fei Xia | Meliha Yetisgen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Early detection and treatment of diseases that onset after a patient is admitted to a hospital, such as pneumonia, is critical to improving and reducing costs in healthcare. Previous studies (Tepper et al., 2013) showed that change-of-state events in clinical notes could be important cues for phenotype detection. In this paper, we extend the annotation schema proposed in (Klassen et al., 2014) to mark change-of-state events, diagnosis events, coordination, and negation. After we have completed the annotation, we build NLP systems to automatically identify named entities and medical events, which yield an f-score of 94.7% and 91.8%, respectively.


pdf bib
In-depth annotation for patient level liver cancer staging
Wen-wai Yim | Sharon Kwan | Meliha Yetisgen
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis

pdf bib
Annotation of Clinically Important Follow-up Recommendations in Radiology Reports
Meliha Yetisgen | Prescott Klassen | Lucas McCarthy | Elena Pellicer | Tom Payne | Martin Gunn
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis


pdf bib
Annotating Clinical Events in Text Snippets for Phenotype Detection
Prescott Klassen | Fei Xia | Lucy Vanderwende | Meliha Yetisgen
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Early detection and treatment of diseases that onset after a patient is admitted to a hospital, such as pneumonia, is critical to improving and reducing costs in healthcare. NLP systems that analyze the narrative data embedded in clinical artifacts such as x-ray reports can help support early detection. In this paper, we consider the importance of identifying the change of state for events - in particular, clinical events that measure and compare the multiple states of a patient’s health across time. We propose a schema for event annotation comprised of five fields and create preliminary annotation guidelines for annotators to apply the schema. We then train annotators, measure their performance, and finalize our guidelines. With the complete guidelines, we then annotate a corpus of snippets extracted from chest x-ray reports in order to integrate the annotations as a new source of features for classification tasks.

pdf bib
Biomedical/Clinical NLP
Ozlem Uzuner | Meliha Yetişgen | Amber Stubbs
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Tutorial Abstracts


pdf bib
Annotating Change of State for Clinical Events
Lucy Vanderwende | Fei Xia | Meliha Yetisgen-Yildiz
Workshop on Events: Definition, Detection, Coreference, and Representation

pdf bib
Identification of Patients with Acute Lung Injury from Free-Text Chest X-Ray Reports
Meliha Yetisgen-Yildiz | Cosmin Bejan | Mark Wurfel
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing


pdf bib
Statistical Section Segmentation in Free-Text Clinical Records
Michael Tepper | Daniel Capurro | Fei Xia | Lucy Vanderwende | Meliha Yetisgen-Yildiz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Automatically segmenting and classifying clinical free text into sections is an important first step to automatic information retrieval, information extraction and data mining tasks, as it helps to ground the significance of the text within. In this work we describe our approach to automatic section segmentation of clinical records such as hospital discharge summaries and radiology reports, along with section classification into pre-defined section categories. We apply machine learning to the problems of section segmentation and section classification, comparing a joint (one-step) and a pipeline (two-step) approach. We demonstrate that our systems perform well when tested on three data sets, two for hospital discharge summaries and one for radiology reports. We then show the usefulness of section information by incorporating it in the task of extracting comorbidities from discharge summaries.


pdf bib
Annotating Large Email Datasets for Named Entity Recognition with Mechanical Turk
Nolan Lawson | Kevin Eustice | Mike Perkowitz | Meliha Yetisgen-Yildiz
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk

pdf bib
Preliminary Experiments with Amazon’s Mechanical Turk for Annotating Medical Named Entities
Meliha Yetisgen-Yildiz | Imre Solti | Fei Xia | Scott Halgrim
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk