Santhosh Cherian


Data Augmentation for Radiology Report Simplification
Ziyu Yang | Santhosh Cherian | Slobodan Vucetic
Findings of the Association for Computational Linguistics: EACL 2023

This work considers the development of a text simplification model to help patients better understand their radiology reports. To address the data scarcity caused by the high cost of manual simplification, the paper proposes a data augmentation approach. The approach prompts a large pre-trained foundation language model to generate simplifications of unlabeled radiology sentences, and it also paraphrases labeled radiology sentences. Experimental results show that the proposed data augmentation approach enables the training of a significantly more accurate simplification model than the baselines.