Hongfang Liu


2025

Responding to patient portal messages places a substantial burden on clinicians. One way to mitigate this burden is to automatically generate answers to patient questions from their medical records. In this study, we propose a clinical question answering system for the BioNLP 2025 Shared Task on Grounded Electronic Health Record Question Answering. The system processes each patient message case by selecting relevant sentences as evidence from the associated clinical notes and generating a concise, medically accurate answer to the patient’s question. A generative AI model from OpenAI (GPT-4o) is used to assist with sentence selection and answer generation. Each response is grounded in source text, limited to 75 words, and includes sentence-level citations. The system was evaluated on 100 test cases using alignment, citation, and summarization metrics. Our results indicate that a clinical question answering system based on generative AI models has significant potential to streamline communication between patients and healthcare providers by automatically generating responses to patient messages.
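The output constraints named in the abstract (a 75-word limit and sentence-level citations) can be illustrated with a small validator. This is only a sketch under stated assumptions: the bracketed citation format such as `[2]` or `[1, 4]`, the placement of the marker before the final punctuation, and the function name `check_answer` are hypothetical and are not taken from the shared task or from the authors' system.

```python
import re

MAX_WORDS = 75  # word limit stated in the abstract

# Hypothetical citation marker, e.g. "[2]" or "[1, 4]" (an assumption, not the task's format).
CITATION = re.compile(r"\[(\d+(?:,\s*\d+)*)\]")
WORD = re.compile(r"\w+(?:[-']\w+)*")


def check_answer(answer: str, num_note_sentences: int) -> list[str]:
    """Return a list of problems; an empty list means the answer passes."""
    problems = []

    # Remove citation markers before counting so they do not inflate the word count.
    text_only = CITATION.sub("", answer)
    word_count = len(WORD.findall(text_only))
    if word_count > MAX_WORDS:
        problems.append(f"too long: {word_count} words")

    # Split into sentences on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    for i, sentence in enumerate(sentences, start=1):
        cited = [
            int(n)
            for group in CITATION.findall(sentence)
            for n in group.split(",")
        ]
        if not cited:
            problems.append(f"sentence {i} has no citation")
        for n in cited:
            # Every cited index must refer to an existing note sentence (1-based).
            if not 1 <= n <= num_note_sentences:
                problems.append(f"sentence {i} cites unknown note sentence {n}")
    return problems
```

A check like this would only verify the form of a response; whether the cited sentences actually support the answer is what the alignment and citation metrics mentioned above are meant to measure.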

2024

Extracting timeline information from clinical narratives is critical for cancer research and practice using electronic health records (EHRs). In this study, we apply MedTimeline, our end-to-end hybrid NLP system that combines large language models and deep learning with knowledge engineering, to the ChemoTimeLine challenge subtasks. Our experiments yield scores of 0.83, 0.90, and 0.84 for subtask 1 and 0.53, 0.63, and 0.39 for subtask 2 on breast cancer, melanoma, and ovarian cancer, respectively.

2019

Neural network models have shown promise in the temporal relation extraction task. In this paper, we present an attention-based neural network model to extract containment relations within sentences from clinical narratives. The attention mechanism, used on top of a GRU model, outperforms the existing state-of-the-art neural network models on the THYME corpus in intra-sentence temporal relation extraction.
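The abstract does not specify how attention is applied over the GRU outputs. As a rough illustration only, the sketch below shows one common formulation, dot-product attention pooling, in which each hidden state is scored against a query vector and the scores are turned into weights with a softmax. The function name `attention_pool`, the query vector, and the toy hidden states are assumptions for illustration, not the paper's model.

```python
import math


def attention_pool(hidden_states, query):
    """Softmax-weighted sum of hidden states, scored by dot product with `query`.

    hidden_states: list of equal-length vectors (e.g. one GRU output per token).
    query: vector of the same length (learned in a real model, fixed here).
    """
    # One scalar score per time step.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]

    # Numerically stable softmax over the time steps.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy usage: two 2-dimensional "hidden states" and a query favouring the first.
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0]], [10.0, 0.0])
```

In a relation classifier, the resulting context vector would typically be passed to a final classification layer; that step is omitted here.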

2017

In this paper, we present MayoNLP’s results from participating in the ScienceIE shared task at SemEval 2017. We focused on the keyphrase classification task (Subtask B). We explored semantic similarities and patterns of keyphrases in scientific publications using pre-trained word embedding models. We propose Word Embedding Distance Pattern, which uses the head noun word embedding to generate distance patterns based on labeled keyphrases, as an incremental feature set to enhance conventional Named Entity Recognition feature sets. A support vector machine is used as the supervised classifier for keyphrase classification. Our system achieved an overall F1 score of 0.67 for keyphrase classification and 0.64 for keyphrase classification and relation detection.

2016

Domain-specific annotations for NLP are often centered on real-world applications of text, and incorrect annotations may be particularly unacceptable. In medical text, the process of manual chart review (of a patient’s medical record) is error-prone due to its complexity. We propose a staggered NLP-assisted approach to the refinement of clinical annotations, an interactive process that allows initial human judgments to be verified or falsified by means of comparison with an improving NLP system. We show on our internal Asthma Timelines dataset that this approach improves the quality of the human-produced clinical annotations.
Privacy concerns have often served as an insurmountable barrier for the production of research and resources in clinical information retrieval (IR). We believe that both clinical IR research innovation and legitimate privacy concerns can be served by the creation of intra-institutional, fully protected resources. In this paper, we provide some principles and tools for IR resource-building in the unique problem setting of patient-level IR, following the tradition of the Cranfield paradigm.

2015

2013

2005

2004