Harnessing Open-Source LLMs for Tender Named Entity Recognition

Asim Abbas, Venelin Kovatchev, Mark Lee, Niloofer Shanavas, Mubashir Ali


Abstract
In the public procurement domain, extracting accurate tender entities from unstructured text remains a critical yet under-explored challenge, because tender data is highly sensitive and confidential and is not openly available. Previously, state-of-the-art NLP models were developed for this task; however, developing an NER model from scratch requires huge amounts of data and resources. Similarly, fine-tuning a transformer-based model such as BERT requires training data, which poses challenges in training data cost, model generalization, and data privacy. To address these challenges, emerging LLMs such as GPT-4 in a few-shot learning setting achieve state-of-the-art (SOTA) performance comparable to fine-tuned models. However, depending on closed-source commercial LLMs involves high cost and privacy concerns. In this study, we investigate open-source LLMs such as Mistral and LLAMA-3 for NER in the tender domain, running on local consumer-grade CPUs in three settings: zero-shot, one-shot, and few-shot learning. The motivation is to reduce costs compared to a cloud solution while preserving accuracy and data privacy. We use two datasets: an open-source dataset from Singapore and closed-source, commercially sensitive data provided by Siemens. All the open-source LLMs achieve above 85% F1-score on the open-source dataset and above 90% F1-score on the closed-source dataset.
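As an illustration of the three prompting settings and the F1 metric named in the abstract, the following is a minimal sketch, not the authors' implementation. The label set, the prompt wording, the "entity | LABEL" output format, and the function names build_prompt, parse_entities, and f1_score are all assumptions made for this example; the abstract does not specify any of them. The number of demonstrations passed in determines the setting: none for zero-shot, one for one-shot, several for few-shot.

```python
from typing import List, Set, Tuple

# Hypothetical label set; the paper's actual tender entity types are not
# listed in the abstract.
LABELS = ["ORGANIZATION", "DATE", "AMOUNT"]

Entity = Tuple[str, str]            # (surface text, label)
Demo = Tuple[str, List[Entity]]     # (tender text, its gold entities)


def build_prompt(text: str, demos: List[Demo]) -> str:
    """Build a k-shot NER prompt, where k = len(demos).

    k = 0 is zero-shot, k = 1 is one-shot, k > 1 is few-shot.
    The wording and output format are assumptions for illustration.
    """
    lines = [
        "Extract named entities from the tender text.",
        "Allowed labels: " + ", ".join(LABELS) + ".",
        "Answer with one 'entity | LABEL' pair per line.",
        "",
    ]
    for demo_text, demo_entities in demos:
        lines.append("Text: " + demo_text)
        lines.append("Entities:")
        lines.extend(f"{span} | {label}" for span, label in demo_entities)
        lines.append("")
    lines.append("Text: " + text)
    lines.append("Entities:")
    return "\n".join(lines)


def parse_entities(output: str) -> Set[Entity]:
    """Parse 'entity | LABEL' lines from model output, skipping anything else."""
    entities: Set[Entity] = set()
    for line in output.splitlines():
        if "|" not in line:
            continue
        span, label = (part.strip() for part in line.rsplit("|", 1))
        if span and label in LABELS:
            entities.add((span, label))
    return entities


def f1_score(predicted: Set[Entity], gold: Set[Entity]) -> float:
    """Exact-match entity-level F1 between predicted and gold entity sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

The prompt string would be sent to a locally hosted model through whatever inference runtime is in use, and the raw reply passed to parse_entities before scoring. Exact-match scoring is only one possible convention; the abstract does not state which matching scheme the reported F1-scores use.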
Anthology ID:
2025.ranlp-1.1
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
1–10
URL:
https://aclanthology.org/2025.ranlp-1.1/
Cite (ACL):
Asim Abbas, Venelin Kovatchev, Mark Lee, Niloofer Shanavas, and Mubashir Ali. 2025. Harnessing Open-Source LLMs for Tender Named Entity Recognition. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 1–10, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Harnessing Open-Source LLMs for Tender Named Entity Recognition (Abbas et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.1.pdf