KEFT: Knowledge-Enhanced Fine-Tuning for Large Language Models in Domain-Specific Question Answering

Haiyun Li; Jixin Zhang; Hua Shen; Ke Cheng; Xiaofeng Huang

doi:10.1162/tacl.a.31

Abstract

The rapid advancement of large language models (LLMs) such as ChatGPT and ChatGLM has opened up promising opportunities for downstream applications in question answering (QA). However, without fine-tuning, such LLMs do not perform well on domain-specific QA tasks, and directly fine-tuning them on domain-specific corpus data may lead to catastrophic forgetting, causing them to lose their general language capability. To address this problem, we propose Knowledge-Enhanced Fine-Tuning (KEFT), an unsupervised fine-tuning approach that enhances the knowledge capability of LLMs in domain-specific QA tasks while preserving their general language capability. KEFT leverages the inherent language comprehension of pre-trained LLMs to autonomously generate synthetic QA datasets from domain-specific corpus data for fine-tuning, and adopts Low-Rank Adaptation (LoRA) to further alleviate over-fitting. Furthermore, to enhance the representation of domain-specific knowledge, we introduce a knowledge-enhanced fine-tuning loss function that encourages the model to learn the connection between knowledge and questions, thereby generating natural and knowledgeable answers. Our evaluations across multiple domain-specific datasets demonstrate that KEFT surpasses state-of-the-art fine-tuning approaches, improving the QA performance of various LLMs in both English and Chinese.
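
For readers unfamiliar with the LoRA component mentioned in the abstract, the following is a minimal sketch of the generic low-rank update, not the authors' implementation. It assumes the standard LoRA formulation in which a frozen weight matrix W is augmented by a trainable product B·A scaled by alpha / r. The dimensions, the value of alpha, and all function and variable names are hypothetical and chosen only for illustration; KEFT's synthetic QA generation and its knowledge-enhanced loss are not shown, since the abstract does not specify them.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)                 # A has shape (r, d_in)
    delta = matmul(b, a)       # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def count_params(m):
    """Number of scalar entries in a matrix."""
    return sum(len(row) for row in m)

# Hypothetical sizes, chosen only to keep the example small.
random.seed(0)
d_out, d_in, r, alpha = 8, 6, 2, 4.0

# Frozen pre-trained weight (random stand-in values).
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable adapters: A gets a small random init, B starts at zero,
# so the adapted layer initially behaves exactly like the frozen one.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha)
trainable = count_params(a) + count_params(b)
print(f"trainable adapter parameters: {trainable} of {count_params(w)} in W")
```

Because only A and B are updated during fine-tuning, the number of trainable parameters grows with r·(d_in + d_out) rather than d_in·d_out, which is the usual argument for why a low-rank adapter limits how far the model can drift from its pre-trained weights.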

Anthology ID: 2025.tacl-1.49
Volume: Transactions of the Association for Computational Linguistics, Volume 13
Year: 2025
Address: Cambridge, MA
Venue: TACL
Publisher: MIT Press
Pages: 1056–1067
URL: https://aclanthology.org/2025.tacl-1.49/
DOI: 10.1162/tacl.a.31
Cite (ACL): Haiyun Li, Jixin Zhang, Hua Shen, Ke Cheng, and Xiaofeng Huang. 2025. KEFT: Knowledge-Enhanced Fine-Tuning for Large Language Models in Domain-Specific Question Answering. Transactions of the Association for Computational Linguistics, 13:1056–1067.
Cite (Informal): KEFT: Knowledge-Enhanced Fine-Tuning for Large Language Models in Domain-Specific Question Answering (Li et al., TACL 2025)
PDF: https://aclanthology.org/2025.tacl-1.49.pdf
