Multi-Facet Blending for Faceted Query-by-Example Retrieval

Heejin Do; Sangwon Ryu; Jonghwi Kim; Gary Lee

doi:10.18653/v1/2025.acl-long.1388

Abstract

With the growing demand to fit fine-grained user intents, faceted query-by-example (QBE), which retrieves similar documents conditioned on specific facets, has recently gained attention. However, because facet-level relevance datasets are lacking, prior approaches mainly depend on document-level comparisons using basic indicators such as citations. This limits their use to citation-based domains and fails to capture the intricacies of facet constraints. In this paper, we propose a multi-facet blending (FaBle) augmentation method, which exploits modularity by decomposing and recomposing documents to explicitly synthesize facet-specific training sets. We automatically decompose documents into facet units and generate relevant and irrelevant pairs by leveraging LLMs’ intrinsic distinguishing capabilities; dynamically recomposing the units then yields facet-wise relevance-informed document pairs. Our modularization eliminates the need for pre-defined facet knowledge or labels. Further, to demonstrate FaBle’s efficacy in a new domain beyond citation-based scientific paper retrieval, we release a benchmark dataset for educational exam item QBE. FaBle augmentation on 1K documents remarkably assists training in obtaining facet-conditional embeddings.
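The decompose-then-recompose idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `FacetUnit`, `compose`, and `blend`, the facet names, and the placeholder strings are all hypothetical. The LLM generation step is replaced by precomputed strings. The rule used here to build positives and negatives (match on the target facet only, mismatch elsewhere) is an assumption about how facet-conditioned pairs might be formed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FacetUnit:
    """One facet of a decomposed document (hypothetical structure)."""
    original: str    # the unit extracted from the source document
    relevant: str    # stand-in for an LLM-written unit similar to `original`
    irrelevant: str  # stand-in for an LLM-written unit dissimilar to `original`


def compose(units: dict[str, str]) -> str:
    """Join facet units back into one document, keeping facet order."""
    return " ".join(units.values())


def blend(doc: dict[str, FacetUnit], target: str) -> dict[str, str]:
    """Recompose units into a (query, positive, negative) triple for `target`.

    Assumed rule: the positive matches the query only on the target facet,
    while the negative matches on every facet except the target.
    """
    if target not in doc:
        raise KeyError(f"unknown facet: {target}")
    query = {name: unit.original for name, unit in doc.items()}
    positive = {
        name: unit.relevant if name == target else unit.irrelevant
        for name, unit in doc.items()
    }
    negative = {
        name: unit.irrelevant if name == target else unit.relevant
        for name, unit in doc.items()
    }
    return {
        "facet": target,
        "query": compose(query),
        "positive": compose(positive),
        "negative": compose(negative),
    }


# Placeholder units; a real pipeline would obtain these from an LLM.
doc = {
    "background": FacetUnit("B0", "B+", "B-"),
    "method": FacetUnit("M0", "M+", "M-"),
    "result": FacetUnit("R0", "R+", "R-"),
}
triple = blend(doc, "method")
```

Under this assumed rule, the negative is similar to the query at the document level but differs on the target facet, which is the kind of distinction that document-level comparison alone would miss.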

Anthology ID: 2025.acl-long.1388
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 28577–28590
URL: https://aclanthology.org/2025.acl-long.1388/
DOI: 10.18653/v1/2025.acl-long.1388
Cite (ACL): Heejin Do, Sangwon Ryu, Jonghwi Kim, and Gary Lee. 2025. Multi-Facet Blending for Faceted Query-by-Example Retrieval. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 28577–28590, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Multi-Facet Blending for Faceted Query-by-Example Retrieval (Do et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.1388.pdf
