Know the Known and the Unknown: Reasonable Answer Generation with Knowledge-Informed Citations

Yichi Zhang; Zhuo Chen; Lingbing Guo; Jun Xu; Mengshu Sun; Zhizhen Liu; Lei Liang; Wen Zhang; Huajun Chen

Abstract

Question answering (QA) with reference texts is a classic application scenario for large language models (LLMs), where high standards for the credibility and traceability of generated answers are crucial. Many existing approaches focus on generating multi-level citations linked to specific references within the answer, making it verifiable and trustworthy. However, they often overlook key challenges such as citation granularity, the awareness of unknown information, and the adoption of effective training strategies. In this paper, we introduce Knowledge-informed Citation (KFC), which addresses these issues through a novel data construction pipeline, a new benchmark, and an innovative training strategy. With approximately 42K samples spanning 19 distinct domains, KFC includes both traditional citations referencing known entity-level information and specialized citations referring to unknown knowledge in the given question. This structure provides a more granular approach to citations, guiding the model to recognize and explicitly indicate unknown information, thus enhancing the quality and credibility of the response. Additionally, we propose a self-correction paradigm, Self-KFC, designed to fine-tune LLMs by refining poorly cited answers into more accurate ones, making it particularly suitable for citation-dependent scenarios. We present comprehensive experimental results to demonstrate the effectiveness and generalization of Self-KFC on the KFC benchmark.

Anthology ID: 2026.acl-long.867
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 18996–19011
URL: https://aclanthology.org/2026.acl-long.867/
Cite (ACL): Yichi Zhang, Zhuo Chen, Lingbing Guo, Jun Xu, Mengshu Sun, Zhizhen Liu, Lei Liang, Wen Zhang, and Huajun Chen. 2026. Know the Known and the Unknown: Reasonable Answer Generation with Knowledge-Informed Citations. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 18996–19011, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Know the Known and the Unknown: Reasonable Answer Generation with Knowledge-Informed Citations (Zhang et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.867.pdf
Checklist: 2026.acl-long.867.checklist.pdf
