UTSA-NLP at RadSum23: Multi-modal Retrieval-Based Chest X-Ray Report Summarization

Tongnian Wang, Xingmeng Zhao, Anthony Rios


Abstract
Radiology report summarization aims to automatically provide concise summaries of radiology findings, reducing the time and errors associated with manual summarization. However, current methods summarize only the text, which overlooks critical details in the images. Unfortunately, directly using the images in a multimodal model is difficult: multimodal models are susceptible to overfitting due to their increased capacity, and different modalities tend to overfit and generalize at different rates. Thus, we propose a novel retrieval-based approach that uses image similarities to generate additional text features. We further employ few-shot learning with chain-of-thought prompting and ensemble techniques to boost performance. Overall, our method achieves state-of-the-art performance on the F1RadGraph score, which measures the factual correctness of summaries. We rank second among 11 teams on both the MIMIC-CXR and MIMIC-III hidden test sets.
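The retrieval idea in the abstract (using image similarity to obtain extra text features) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional image vectors, the example report strings, the `[RETRIEVED]` separator, and the helper names `cosine`, `retrieve_neighbors`, and `build_input` are all hypothetical. The abstract does not say which image encoder, similarity measure, or input format the authors use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_neighbors(query_vec, index, k=1):
    """Return the texts attached to the k indexed images most similar to query_vec.

    `index` is a list of (image_vector, report_text) pairs; this layout is hypothetical.
    """
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]


def build_input(findings, query_vec, index, k=1, sep=" [RETRIEVED] "):
    """Append retrieved texts to the findings so a text-only summarizer can see them.

    The separator string is an assumed placeholder, not a token from the paper.
    """
    retrieved = retrieve_neighbors(query_vec, index, k)
    return findings + "".join(sep + text for text in retrieved)


# Toy index: made-up image embeddings paired with made-up report sentences.
index = [
    ([1.0, 0.0], "No acute cardiopulmonary process."),
    ([0.0, 1.0], "Small left pleural effusion."),
]

augmented = build_input("Lungs are clear.", [0.9, 0.1], index, k=1)
print(augmented)
```

In this sketch the image never enters the summarization model directly. It only selects which neighboring text is appended to the input, which is one way to read the abstract's motivation for avoiding a fully multimodal model.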
Anthology ID:
2023.bionlp-1.58
Volume:
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
557–566
URL:
https://aclanthology.org/2023.bionlp-1.58
DOI:
10.18653/v1/2023.bionlp-1.58
Cite (ACL):
Tongnian Wang, Xingmeng Zhao, and Anthony Rios. 2023. UTSA-NLP at RadSum23: Multi-modal Retrieval-Based Chest X-Ray Report Summarization. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 557–566, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
UTSA-NLP at RadSum23: Multi-modal Retrieval-Based Chest X-Ray Report Summarization (Wang et al., BioNLP 2023)
PDF:
https://aclanthology.org/2023.bionlp-1.58.pdf