Using Structured Health Information for Controlled Generation of Clinical Cases in French

Hugo Boulanger, Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol


Abstract
Text generation opens up new prospects for overcoming the lack of open corpora in fields such as healthcare, where data sharing is bound by confidentiality. In this study, we compare the performance of encoder-decoder and decoder-only language models for the controlled generation of clinical cases in French. To do so, for each architecture, we fine-tune several pre-trained models on French clinical cases and generate clinical cases conditioned on patient demographic information (gender and age) and clinical features. Our results suggest that encoder-decoder models are easier to control than decoder-only models, but more costly to train.
Anthology ID:
2024.clinicalnlp-1.14
Volume:
Proceedings of the 6th Clinical Natural Language Processing Workshop
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Danielle Bitterman
Venues:
ClinicalNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
172–184
URL:
https://aclanthology.org/2024.clinicalnlp-1.14
DOI:
10.18653/v1/2024.clinicalnlp-1.14
Cite (ACL):
Hugo Boulanger, Nicolas Hiebel, Olivier Ferret, Karën Fort, and Aurélie Névéol. 2024. Using Structured Health Information for Controlled Generation of Clinical Cases in French. In Proceedings of the 6th Clinical Natural Language Processing Workshop, pages 172–184, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Using Structured Health Information for Controlled Generation of Clinical Cases in French (Boulanger et al., ClinicalNLP-WS 2024)
PDF:
https://aclanthology.org/2024.clinicalnlp-1.14.pdf