Concept-Based RAG Models: A High-Accuracy Fact Retrieval Approach

Cheng-Yu Lin, Jyh-Shing Jang


Abstract
This study introduces a concept-based methodology to optimize Retrieval-Augmented Generation (RAG) tasks by assessing dataset certainty using entropy-based metrics and concept extraction techniques. Unlike traditional methods focused on reducing LLM hallucinations or modifying data structures, this approach evaluates inherent knowledge uncertainty from an LLM perspective. By pre-processing documents with LLMs, the concept-based method significantly enhances precision in tasks demanding high accuracy, such as legal, financial, or formal document responses.
Anthology ID:
2025.finnlp-1.8
Volume:
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Chung-Chi Chen, Antonio Moreno-Sandoval, Jimin Huang, Qianqian Xie, Sophia Ananiadou, Hsin-Hsi Chen
Venues:
FinNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
96–100
URL:
https://aclanthology.org/2025.finnlp-1.8/
Cite (ACL):
Cheng-Yu Lin and Jyh-Shing Jang. 2025. Concept-Based RAG Models: A High-Accuracy Fact Retrieval Approach. In Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal), pages 96–100, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Concept-Based RAG Models: A High-Accuracy Fact Retrieval Approach (Lin & Jang, FinNLP 2025)
PDF:
https://aclanthology.org/2025.finnlp-1.8.pdf