Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization

Sajad Sotudeh Gharebagh, Nazli Goharian, Ross Filice


Abstract
The sequence-to-sequence (seq2seq) network is a well-established model for the text summarization task. It can learn to produce readable content; however, it falls short in effectively identifying key regions of the source. In this paper, we approach the content selection problem for clinical abstractive summarization by augmenting the summarizer with salient ontological terms. Our experiments on two publicly available clinical data sets (107,372 reports from MIMIC-CXR and 3,366 reports from OpenI) show that our model statistically significantly improves on state-of-the-art results in terms of ROUGE metrics (improvements of 2.9% RG-1, 2.5% RG-2, and 1.9% RG-L) in the healthcare domain, where any improvement impacts patients’ welfare.
Anthology ID:
2020.acl-main.172
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1899–1905
URL:
https://aclanthology.org/2020.acl-main.172
DOI:
10.18653/v1/2020.acl-main.172
Cite (ACL):
Sajad Sotudeh Gharebagh, Nazli Goharian, and Ross Filice. 2020. Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 1899–1905, Online. Association for Computational Linguistics.
Cite (Informal):
Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization (Sotudeh Gharebagh et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.172.pdf
Video:
http://slideslive.com/38929326