Generating Continuous Representations of Medical Texts

Graham Spinks, Marie-Francine Moens


Abstract
We present an architecture that generates medical texts while learning an informative, continuous representation with discriminative features. During training, the input to the system is a dataset of captions for medical X-rays. The acquired continuous representations are of particular interest for the many machine learning techniques in which the discrete and high-dimensional nature of textual input is an obstacle. We use an Adversarially Regularized Autoencoder to create realistic text in both an unconditional and a conditional setting. We show that this technique is applicable to medical texts, which often contain syntactic and domain-specific shorthands. A quantitative evaluation shows that we achieve a lower model perplexity than a traditional LSTM generator.
Anthology ID:
N18-5014
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Yang Liu, Tim Paek, Manasi Patwardhan
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
66–70
URL:
https://aclanthology.org/N18-5014/
DOI:
10.18653/v1/N18-5014
Cite (ACL):
Graham Spinks and Marie-Francine Moens. 2018. Generating Continuous Representations of Medical Texts. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 66–70, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Generating Continuous Representations of Medical Texts (Spinks & Moens, NAACL 2018)
PDF:
https://aclanthology.org/N18-5014.pdf