Beyond Abstracts: A New Dataset, Prompt Design Strategy and Method for Biomedical Synthesis Generation

James O’Doherty, Cian Nolan, Yufang Hou, Anya Belz


Abstract
The biomedical field relies on cost- and time-intensive systematic reviews of papers to enable practitioners to keep up to date with research. Impressive recent advances in large language models (LLMs) have made the task of automating at least part of the systematic review process feasible, but progress is slow. This paper identifies some factors that may have been holding research back, and proposes a new, enhanced dataset and prompting-based method for automatic synthesis generation, the most challenging step for automation. We test different models and types of information from and about biomedical studies for their usefulness in obtaining high-quality results. We find that, surprisingly, inclusion of paper abstracts can worsen results. Instead, study summary information, and system instructions informed by domain knowledge, are key to producing high-quality syntheses.
Anthology ID:
2024.acl-srw.42
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Xiyan Fu, Eve Fleisig
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
499–518
URL:
https://aclanthology.org/2024.acl-srw.42
Cite (ACL):
James O’Doherty, Cian Nolan, Yufang Hou, and Anya Belz. 2024. Beyond Abstracts: A New Dataset, Prompt Design Strategy and Method for Biomedical Synthesis Generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 499–518, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Beyond Abstracts: A New Dataset, Prompt Design Strategy and Method for Biomedical Synthesis Generation (O’Doherty et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-srw.42.pdf