NormaTex-MapSNOMED: Bridging the Gap Between Brazilian Portuguese Clinical Narratives and SNOMED CT

Isabela Araujo, Claudia Moro, Layslla Martinez


Abstract
Clinical narratives written in free text contain valuable information for patient care. However, their unstructured nature and linguistic variability pose significant challenges for automatic processing and interoperability. In particular, mapping clinical terms to standardized terminologies such as SNOMED Clinical Terms (SNOMED CT) remains difficult for languages other than English, including Brazilian Portuguese. This paper presents NormaTex-MapSNOMED, a proposed component of the NormaTex framework that maps clinical terms to predefined categories aligned with SNOMED CT. Given previously extracted terms, the method uses large language models (LLMs) guided by a structured prompt to assign each term to a target category. Experiments were conducted on Portuguese-language clinical narratives, and the mappings were evaluated using three complementary strategies: lexical similarity based on Levenshtein distance, contextual similarity using a BERT-based model, and semantic validation using LLMs. The results show that LLM-based evaluation consistently outperforms the lexical and contextual baselines across different models, with higher precision for disease-related terms than for symptom-related expressions. These findings indicate that LLMs are a promising approach for semantic mapping of clinical terms in Brazilian Portuguese and can support clinical term normalization and interoperability with standardized terminologies.
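As an illustration of the lexical baseline named in the abstract, the following is a minimal sketch of a Levenshtein-based similarity between an extracted term and candidate category labels. The abstract does not say how the distance is normalized or thresholded, so the length normalization, the case folding, and the function names (`levenshtein`, `lexical_similarity`, `best_match`) are assumptions made for this example, not the authors' implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalizing by the longer length is an assumption."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(term: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate label with the highest lexical similarity to the term."""
    return max(
        ((candidate, lexical_similarity(term, candidate)) for candidate in candidates),
        key=lambda pair: pair[1],
    )


if __name__ == "__main__":
    # Hypothetical candidate labels, used only to exercise the functions.
    print(best_match("diabete", ["diabetes mellitus", "diabetes", "dispneia"]))
```

A purely lexical score like this only rewards shared characters, so it cannot relate synonyms or paraphrases, which is consistent with the abstract's finding that the lexical baseline is outperformed by LLM-based evaluation.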
Anthology ID:
2026.propor-1.115
Volume:
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
Month:
April
Year:
2026
Address:
Salvador, Brazil
Editors:
Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
Venue:
PROPOR
Publisher:
Association for Computational Linguistics
Pages:
1085–1091
URL:
https://aclanthology.org/2026.propor-1.115/
Cite (ACL):
Isabela Araujo, Claudia Moro, and Layslla Martinez. 2026. NormaTex-MapSNOMED: Bridging the Gap Between Brazilian Portuguese Clinical Narratives and SNOMED CT. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 1085–1091, Salvador, Brazil. Association for Computational Linguistics.
Cite (Informal):
NormaTex-MapSNOMED: Bridging the Gap Between Brazilian Portuguese Clinical Narratives and SNOMED CT (Araujo et al., PROPOR 2026)
PDF:
https://aclanthology.org/2026.propor-1.115.pdf