SINAI at BioLaySumm: Self-Play Fine-Tuning of Large Language Models for Biomedical Lay Summarisation

Mariia Chizhikova, Manuel Carlos Díaz-Galiano, L. Alfonso Ureña-López, María-Teresa Martín-Valdivia


Abstract
Effective dissemination of scientific knowledge and advances to the general public is often hindered by the complexity of the technical language used in research, which is frequently very difficult, if not impossible, for non-experts to understand. In this paper we present the approach developed by the SINAI team for the BioLaySumm shared task hosted by the BioNLP workshop at ACL 2024. Our approach stems from experiments testing the ability of state-of-the-art pre-trained large language models, namely GPT-3.5, GPT-4 and Llama-3, to tackle this task in a few-shot manner. To improve on this baseline, we fine-tuned Llama-3 using parameter-efficient methods. The best-performing system resulted from applying self-play fine-tuning, a method that allows the model to improve by learning to distinguish its own generations from the previous iteration from the gold-standard summaries. This approach achieved a ROUGE-1 score of 0.4205 and a BERTScore of 0.8583.
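For reference, self-play fine-tuning (SPIN; Chen et al., 2024) casts training as a two-player game in which the model at iteration t+1 is optimised to separate gold-standard lay summaries y from summaries y' generated by its own previous checkpoint. The sketch below assumes the standard SPIN formulation with a logistic loss \(\ell(s) = \log(1 + e^{-s})\) and regularisation parameter \(\lambda\); the exact objective and hyperparameters used by the authors are not stated in this abstract.

\[
\mathcal{L}_{\mathrm{SPIN}}(\theta_{t+1}) =
\mathbb{E}_{x \sim q,\; y \sim p_{\mathrm{data}}(\cdot \mid x),\; y' \sim p_{\theta_t}(\cdot \mid x)}
\left[ \ell\!\left( \lambda \log \frac{p_{\theta_{t+1}}(y \mid x)}{p_{\theta_t}(y \mid x)}
- \lambda \log \frac{p_{\theta_{t+1}}(y' \mid x)}{p_{\theta_t}(y' \mid x)} \right) \right]
\]

Minimising this objective raises the likelihood of the gold summaries relative to the frozen previous-iteration policy while lowering that of the model's own earlier generations, which is the "distinguish its own generations from the gold standard" behaviour described in the abstract.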
Anthology ID:
2024.bionlp-1.74
Volume:
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Kirk Roberts, Junichi Tsujii
Venues:
BioNLP | WS
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
804–809
URL:
https://aclanthology.org/2024.bionlp-1.74
Cite (ACL):
Mariia Chizhikova, Manuel Carlos Díaz-Galiano, L. Alfonso Ureña-López, and María-Teresa Martín-Valdivia. 2024. SINAI at BioLaySumm: Self-Play Fine-Tuning of Large Language Models for Biomedical Lay Summarisation. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 804–809, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
SINAI at BioLaySumm: Self-Play Fine-Tuning of Large Language Models for Biomedical Lay Summarisation (Chizhikova et al., BioNLP-WS 2024)
PDF:
https://aclanthology.org/2024.bionlp-1.74.pdf