A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature Reviews

Regina Ofori-Boateng, Magaly Aceves-Martins, Nirmalie Wiratunga, Carlos Moreno-Garcia


Abstract
Systematic Reviews (SRs) are foundational in healthcare for synthesising evidence to inform clinical practices. Traditionally skewed towards English-language databases, SRs often exclude significant research in other languages, leading to potential biases. This study addresses this gap by focusing on Spanish, a language notably underrepresented in SRs. We present a foundational zero-shot dual-stage information retrieval (IR) baseline system, integrating traditional retrieval methods with pre-trained language models and cross-attention re-rankers for enhanced accuracy in Spanish biomedical literature retrieval. Utilising the LILACS database, known for its comprehensive coverage of Latin American and Caribbean biomedical literature, we evaluate the approach with three real-life case studies in Spanish SRs. The findings demonstrate the system’s efficacy and underscore the importance of query formulation. This study contributes to the field of IR by promoting language inclusivity and supports the development of more comprehensive and globally representative healthcare guidelines.
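The retrieve-then-re-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the first stage here is a plain BM25 scorer written from scratch, and the second stage is a simple token-overlap placeholder standing in for a cross-attention re-ranker. All function names, parameters, and the example documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems use proper Spanish tokenization)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """First stage: score every document against the query with BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(toks) / avg_len))
        scores.append(score)
    return scores


def placeholder_rerank_score(query, doc):
    """Second stage placeholder: fraction of query tokens found in the document.

    A real system would call a pre-trained cross-attention re-ranker here;
    this stand-in only keeps the sketch self-contained.
    """
    q_tokens = set(tokenize(query))
    d_tokens = set(tokenize(doc))
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


def two_stage_search(query, docs, first_stage_k=3, rerank=placeholder_rerank_score):
    """Retrieve the top-k candidates with BM25, then reorder them with the re-ranker."""
    first_stage = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: first_stage[i], reverse=True)
    candidates = candidates[:first_stage_k]
    # Only the shortlisted candidates are passed to the (more expensive) second stage.
    return sorted(candidates, key=lambda i: rerank(query, docs[i]), reverse=True)
```

The `rerank` argument is the swap point: replacing the placeholder with a function that scores a (query, document) pair using a pre-trained model would give the zero-shot setup the abstract describes, since no component is fine-tuned on the target reviews.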
Anthology ID:
2024.naacl-long.206
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3725–3736
URL:
https://aclanthology.org/2024.naacl-long.206
Cite (ACL):
Regina Ofori-Boateng, Magaly Aceves-Martins, Nirmalie Wiratunga, and Carlos Moreno-Garcia. 2024. A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature Reviews. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 3725–3736, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature Reviews (Ofori-Boateng et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.206.pdf
Copyright:
 2024.naacl-long.206.copyright.pdf