Graham McDonald


2024

This paper presents the UoG Siephers team's participation in the Discharge Me! Shared Task on Streamlining Discharge Documentation. We investigate how to select and encode specific sections of Electronic Health Records (EHRs) as input data for sequence-to-sequence models that generate the discharge instructions and brief hospital course sections of a patient’s EHR. We found that, despite the large volume of disparate information that is often available in EHRs, selectively choosing an appropriate EHR section for training and prompting sequence-to-sequence models resulted in improved generative quality. In particular, we found that using only the history of present illness section of an EHR as input often led to better performance than using multiple EHR sections.

2021

The relationships that exist between entities can be a reliable indicator for classifying sensitive information, such as commercially sensitive information. For example, the relation person-IsDirectorOf-company can indicate whether an individual’s salary should be considered as sensitive personal information. Representations of such relations are often learned using a knowledge graph to produce embeddings for relation types, generalised across different entity-pairs. However, a relation type may or may not correspond to a sensitivity depending on the entities that participate in the relation. Therefore, generalised relation embeddings are typically insufficient for classifying sensitive information. In this work, we propose a novel method for representing entities and relations within a single embedding to better capture the relationship between the entities. Moreover, we show that our proposed entity-relation-entity embedding approach can significantly improve (McNemar’s test, p < 0.05) the effectiveness of sensitivity classification, compared to classification approaches that leverage relation embedding approaches from the literature (0.426 F1 vs 0.413 F1).
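
The significance claim above relies on McNemar's test, which compares two classifiers on the same documents by looking only at the cases where exactly one of them is correct. The following is a minimal sketch of the exact (binomial) form of the test, not the authors' evaluation code; the function name `mcnemar_exact` and the toy labels are assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(gold, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions (illustrative helper).

    Returns (b, c, p_value), where b counts items that only classifier A
    gets right and c counts items that only classifier B gets right.
    """
    b = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a == g and p != g)
    c = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a != g and p == g)
    n = b + c
    if n == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return b, c, 1.0
    # Under the null hypothesis, b follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (hypothetical): 1 = sensitive, 0 = not sensitive.
gold = [1] * 10
pred_a = [1] * 10            # classifier A is correct on all ten items
pred_b = [0] * 8 + [1] * 2   # classifier B is wrong on the first eight
b, c, p_value = mcnemar_exact(gold, pred_a, pred_b)
```

A p-value below 0.05 from such a test would indicate that the two classifiers' error patterns differ by more than chance would explain, which is the sense in which the F1 difference above is reported as significant.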