GraphTranslate: Predicting Clinical Trial Translation using Graph Neural Networks on Biomedical Literature

Emily Muller; Justin Boylan-Toomey; Jack Ekinsmyth; Arne Robben; María De La Paz Cardona; Antonia Langfelder

doi:10.18653/v1/2025.sdp-1.4

Abstract

The translation of basic science into clinical interventions represents a critical yet prolonged pathway in biomedical research, with significant implications for human health. While previous translation prediction approaches have focused on citation-based and metadata metrics or semantic analysis, the complex network structure of scientific knowledge remains under-explored. In this work, we present a novel graph neural network approach that leverages both semantic and structural information to predict which research publications will lead to clinical trials. Our model analyses a comprehensive dataset of 19 million publication nodes, using transformer-based title and abstract sentence embeddings within their citation network context. We demonstrate that our graph-based architecture, which employs attention mechanisms over local citation neighbourhoods, outperforms traditional convolutional approaches by effectively capturing knowledge flow patterns (F1 improvement of 4.5 and 3.5 percentage points for direct and indirect translation). Our metadata is carefully selected to eliminate potential biases from researcher-specific information, while maintaining predictive power through network structural features. Notably, our model achieves state-of-the-art performance using only content-based features, showing that language inherently captures many of the predictive features of translation. Through rigorous validation on a held-out time window (2021), we demonstrate generalisation across different biomedical domains and provide insights into early indicators of translational research potential. Our system offers immediate practical value for research funders, enabling evidence-based assessment of translational potential during grant review processes.
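The idea of attention over a local citation neighbourhood can be illustrated with a small sketch. This is not the authors' architecture, which the abstract does not specify in detail. It is a generic single-head, scaled dot-product attention step over a paper and the papers it is linked to, written in plain Python. The function name `attend_over_neighbours`, the toy two-dimensional embeddings, and the paper identifiers are all hypothetical; real inputs would be transformer sentence embeddings of titles and abstracts, and a trained model would also apply learned projection weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_over_neighbours(node, embeddings, citations):
    """Aggregate a node's embedding with those of its citation neighbours.

    Assumption: scaled dot-product scoring with no learned weights,
    chosen only to keep the illustration short.
    """
    # Include the node itself (a self-loop) plus its linked papers.
    neighbours = [node] + citations.get(node, [])
    query = embeddings[node]
    scale = math.sqrt(len(query))
    # One attention weight per neighbour, normalised to sum to 1.
    weights = softmax([dot(query, embeddings[n]) / scale for n in neighbours])
    # Attention-weighted average of the neighbour embeddings.
    aggregated = [
        sum(w * embeddings[n][d] for w, n in zip(weights, neighbours))
        for d in range(len(query))
    ]
    return aggregated, dict(zip(neighbours, weights))


# Hypothetical toy data: three papers with 2-dimensional embeddings.
embeddings = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.9, 0.1],
    "paper_c": [0.0, 1.0],
}
citations = {"paper_a": ["paper_b", "paper_c"]}

vector, weights = attend_over_neighbours("paper_a", embeddings, citations)
```

In this toy example, `paper_b` is semantically closer to `paper_a` than `paper_c` is, so it receives the larger attention weight and contributes more to the aggregated representation that a downstream classifier would use.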

Anthology ID: 2025.sdp-1.4
Volume: Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Tirthankar Ghosal, Philipp Mayr, Amanpreet Singh, Aakanksha Naik, Georg Rehm, Dayne Freitag, Dan Li, Sonja Schimmler, Anita De Waard
Venues: sdp | WS
Publisher: Association for Computational Linguistics
Pages: 31–41
URL: https://aclanthology.org/2025.sdp-1.4/
DOI: 10.18653/v1/2025.sdp-1.4
Cite (ACL): Emily Muller, Justin Boylan-Toomey, Jack Ekinsmyth, Arne Robben, María De La Paz Cardona, and Antonia Langfelder. 2025. GraphTranslate: Predicting Clinical Trial Translation using Graph Neural Networks on Biomedical Literature. In Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025), pages 31–41, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): GraphTranslate: Predicting Clinical Trial Translation using Graph Neural Networks on Biomedical Literature (Muller et al., sdp 2025)
PDF: https://aclanthology.org/2025.sdp-1.4.pdf
