JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation

Jingyi You, Dongyuan Li, Manabu Okumura, Kenji Suzuki


Abstract
Automated radiology report generation aims to generate paragraphs that describe fine-grained visual differences among cases, especially those between the normal and the diseased. Existing methods seldom consider the cross-modal alignment between textual and visual features and tend to ignore disease tags as an auxiliary for report generation. To bridge the gap between textual and visual information, in this study, we propose a “Jointly learning framework for automated disease Prediction and radiology report Generation (JPG)” to improve the quality of reports through the interaction between the main task (report generation) and two auxiliary tasks (feature alignment and disease prediction). Feature alignment and disease prediction help the model learn text-correlated visual features and record diseases as keywords so that it can output high-quality reports. In turn, the improved reports provide additional, harder samples for feature alignment and disease prediction, which then learn more precise visual and textual representations and improve prediction accuracy. All components are jointly trained so that they improve one another iteratively and progressively. Experimental results demonstrate the effectiveness of JPG on the most commonly used IU X-RAY dataset, showing its superior performance over multiple state-of-the-art image captioning and medical report generation methods with regard to BLEU, METEOR, and ROUGE metrics.
Anthology ID:
2022.coling-1.523
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5989–6001
URL:
https://aclanthology.org/2022.coling-1.523
Cite (ACL):
Jingyi You, Dongyuan Li, Manabu Okumura, and Kenji Suzuki. 2022. JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5989–6001, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation (You et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.523.pdf