HBUT at #SMM4H 2024 Task2: Cross-lingual Few-shot Medical Entity Extraction using a Large Language Model

Yuanzhi Ke, Zhangju Yin, Xinyun Wu, Caiquan Xiong


Abstract
Named entity recognition (NER) of drug and disorder/body function mentions in web text is challenging in the face of multilingualism, limited data, and poor data quality. Traditional small-scale models struggle to cope with the task. Large language models with conventional prompts also yield poor results. In this paper, we introduce our system, which employs a large language model (LLM) with a novel two-step prompting strategy. Instead of directly extracting the target medical entities, our system first extracts all entities and then prompts the LLM to extract drug and disorder entities, given the all-entity list and the original input text as context. The experimental and test results indicate that this strategy improved our system's performance, especially for German.
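The two-step strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the function names `parse_list` and `two_step_extract`, and the `llm` callable (any function that maps a prompt string to a reply string) are all assumptions made for illustration. The final substring check is also an added safeguard and is not taken from the paper.

```python
from typing import Callable, List


def parse_list(reply: str) -> List[str]:
    """Split a newline- or comma-separated reply into unique, trimmed items."""
    items: List[str] = []
    for raw in reply.replace(",", "\n").splitlines():
        item = raw.strip(" \t-*")  # drop bullet markers and padding
        if item and item not in items:
            items.append(item)
    return items


def two_step_extract(text: str, llm: Callable[[str], str]) -> List[str]:
    """Hypothetical two-step prompting: all entities first, then medical ones."""
    # Step 1: ask for every entity, without restricting the type.
    step1_prompt = (
        "List every named entity in the text, one per line.\n\n"
        f"Text: {text}"
    )
    all_entities = parse_list(llm(step1_prompt))

    # Step 2: give the model the all-entity list plus the original text
    # as context, and ask it to keep only the target medical entities.
    step2_prompt = (
        "From the candidate entities, keep only drug and "
        "disorder/body function mentions, one per line.\n\n"
        f"Candidates: {', '.join(all_entities)}\n"
        f"Text: {text}"
    )
    selected = parse_list(llm(step2_prompt))

    # Assumption (not from the paper): discard anything the model
    # produced that does not literally occur in the input text.
    lowered = text.lower()
    return [entity for entity in selected if entity.lower() in lowered]
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client, so the two prompts can be exercised with a stub function before a real model is wired in.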
Anthology ID:
2024.smm4h-1.13
Volume:
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Dongfang Xu, Graciela Gonzalez-Hernandez
Venues:
SMM4H | WS
Publisher:
Association for Computational Linguistics
Pages:
58–62
URL:
https://aclanthology.org/2024.smm4h-1.13
Cite (ACL):
Yuanzhi Ke, Zhangju Yin, Xinyun Wu, and Caiquan Xiong. 2024. HBUT at #SMM4H 2024 Task2: Cross-lingual Few-shot Medical Entity Extraction using a Large Language Model. In Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks, pages 58–62, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
HBUT at #SMM4H 2024 Task2: Cross-lingual Few-shot Medical Entity Extraction using a Large Language Model (Ke et al., SMM4H-WS 2024)
PDF:
https://aclanthology.org/2024.smm4h-1.13.pdf