QuARK: LLM-Based Domain-Specific Question Answering Using Retrieval Augmented Generation and Knowledge Graphs

Edward Burgin, Sourav Dutta, Mingxue Wang


Abstract
Retrieval Augmented Generation (RAG) has been pivotal in using Large Language Models (LLMs) to improve the factuality of long-form question answering systems in industrial settings. Knowledge graphs (KGs) link disparate information sources and can therefore yield useful information for mitigating insufficient knowledge and hallucination within the LLM-RAG pipeline. However, creating a domain-specific KG is costly and usually requires a domain expert. To alleviate these challenges, this work proposes QuARK, a novel domain-specific question answering framework that enhances the knowledge capabilities of LLMs by integrating structured KGs, thereby significantly reducing the reliance on the “generic” latent knowledge of LLMs. We showcase how LLMs can be deployed not only for dynamic information retrieval and answer generation, but also as flexible agents that automatically extract relevant entities and relations for the automated construction of domain-specific KGs. Crucially, we propose how pairing question decomposition with semantic triplet retrieval within RAG can enable optimal subgraph retrieval. Experimental evaluations of our framework on a public financial-domain dataset demonstrate that it enables a robust pipeline incorporating schema-free KGs within a RAG framework, improving overall accuracy by nearly 13%.
Anthology ID:
2025.ranlp-1.25
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
210–217
URL:
https://aclanthology.org/2025.ranlp-1.25/
Cite (ACL):
Edward Burgin, Sourav Dutta, and Mingxue Wang. 2025. QuARK: LLM-Based Domain-Specific Question Answering Using Retrieval Augmented Generation and Knowledge Graphs. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 210–217, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
QuARK: LLM-Based Domain-Specific Question Answering Using Retrieval Augmented Generation and Knowledge Graphs (Burgin et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.25.pdf