Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths

Sangam Lee; Ryang Heo; SeongKu Kang; Susik Yoon; Jinyoung Yeo; Dongha Lee

Abstract

Generative retrieval directly decodes a document identifier (i.e., docid) in response to a query, which makes it impossible to give users an explanation that answers “why is this document retrieved?”. To address this limitation, we propose Hierarchical Category Path-Enhanced Generative Retrieval (HyPE), which enhances explainability by first generating hierarchical category paths step by step and then decoding the docid. By leveraging hierarchical category paths that progress from broader to more specific semantic categories, HyPE can provide a detailed explanation for its retrieval decisions. For training, HyPE constructs category paths from an external high-quality semantic hierarchy, leverages an LLM to select appropriate candidate paths for each document, and optimizes the generative retrieval model on a path-augmented dataset. During inference, HyPE uses a path-aware ranking strategy to aggregate diverse topic information, allowing the most relevant documents to be prioritized in the final ranked list of docids. Our extensive experiments demonstrate that HyPE not only offers a high level of explainability but also improves retrieval performance. We provide the code and a live demo of HyPE at https://augustinlib.github.io/HyPE/

Anthology ID: 2026.findings-acl.1097
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 21807–21824
URL: https://aclanthology.org/2026.findings-acl.1097/
Cite (ACL): Sangam Lee, Ryang Heo, SeongKu Kang, Susik Yoon, Jinyoung Yeo, and Dongha Lee. 2026. Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths. In Findings of the Association for Computational Linguistics: ACL 2026, pages 21807–21824, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Why These Documents? Explainable Generative Retrieval with Hierarchical Category Paths (Lee et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1097.pdf
Checklist: 2026.findings-acl.1097.checklist.pdf
