Large Scale Sequence-to-Sequence Models for Clinical Note Generation from Patient-Doctor Conversations

Gagandeep Singh, Yue Pan, Jesus Andres-Ferrer, Miguel Del-Agua, Frank Diehl, Joel Pinto, Paul Vozila


Abstract
We present our work on building large scale sequence-to-sequence models for generating clinical notes from patient-doctor conversations. This is formulated as an abstractive summarization task for which we use an encoder-decoder transformer model with a pointer-generator. We discuss various modeling enhancements to this baseline model, including a subword and multiword tokenization scheme, prefixing the targets with a chain-of-clinical-facts, and training with a contrastive loss defined over various candidate summaries. We also use flash attention during training and query-chunked attention during inference to process long input and output sequences and to improve computational efficiency. Experiments are conducted on a dataset containing about 900K encounters from around 1,800 healthcare providers covering 27 specialties. The results are broken down into primary care and non-primary care specialties, with consistent accuracy improvements observed across both categories.
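To illustrate the query-chunked attention mentioned in the abstract, here is a minimal NumPy sketch: the query matrix is processed in chunks so that only a chunk-by-keys score matrix is materialized at a time, reducing peak memory on long sequences. All names, shapes, and the chunk size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def chunked_attention(Q, K, V, chunk_size=64):
    """Compute softmax(Q K^T / sqrt(d)) V one query chunk at a time."""
    d = Q.shape[-1]
    out = np.empty((Q.shape[0], V.shape[-1]))
    for start in range(0, Q.shape[0], chunk_size):
        q = Q[start:start + chunk_size]          # (chunk, d)
        scores = q @ K.T / np.sqrt(d)            # (chunk, n_keys)
        out[start:start + chunk_size] = softmax(scores) @ V
    return out

# The chunked result matches full attention up to floating-point error.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((100, 16)) for _ in range(3))
full = softmax(Q @ K.T / np.sqrt(16)) @ V
assert np.allclose(chunked_attention(Q, K, V, chunk_size=32), full)
```

Because each chunk's output rows depend only on that chunk's queries, the loop is exact (not an approximation); the chunk size trades memory for loop overhead.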
Anthology ID:
2023.clinicalnlp-1.18
Volume:
Proceedings of the 5th Clinical Natural Language Processing Workshop
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Anna Rumshisky
Venue:
ClinicalNLP
Publisher:
Association for Computational Linguistics
Pages:
138–143
URL:
https://aclanthology.org/2023.clinicalnlp-1.18
DOI:
10.18653/v1/2023.clinicalnlp-1.18
Cite (ACL):
Gagandeep Singh, Yue Pan, Jesus Andres-Ferrer, Miguel Del-Agua, Frank Diehl, Joel Pinto, and Paul Vozila. 2023. Large Scale Sequence-to-Sequence Models for Clinical Note Generation from Patient-Doctor Conversations. In Proceedings of the 5th Clinical Natural Language Processing Workshop, pages 138–143, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Large Scale Sequence-to-Sequence Models for Clinical Note Generation from Patient-Doctor Conversations (Singh et al., ClinicalNLP 2023)
PDF:
https://aclanthology.org/2023.clinicalnlp-1.18.pdf
Video:
https://aclanthology.org/2023.clinicalnlp-1.18.mp4