SCV: Light and Effective Multi-Vector Retrieval with Sequence Compressive Vectors

Cheoneum Park, Seohyeong Jeong, Minsang Kim, KyungTae Lim, Yong-Hun Lee


Abstract
Recent advances in language models (LMs) have driven progress in information retrieval (IR), enabling effective extraction of semantically relevant information. However, these models face challenges in balancing computational cost with deeper query-document interactions. To tackle this, we present two mechanisms: 1) a light and effective multi-vector retrieval method with sequence compressive vectors, dubbed SCV, and 2) coarse-to-fine vector search. The strength of SCV stems from its use of span compressive vectors for scoring. By applying a non-linear operation to every token in the document, we abstract the tokens into span-level representations. These vectors effectively reduce the size of the document’s representation, enabling the model to engage comprehensively with tokens across the entire collection of documents, rather than only the subset retrieved by Approximate Nearest Neighbor search. Therefore, our framework performs a coarse single-vector search during the inference stage and conducts a fine-grained multi-vector search end-to-end. This approach effectively reduces the cost of search. We empirically show that SCV achieves the lowest latency among the state-of-the-art models compared and obtains competitive performance on both in-domain and out-of-domain benchmark datasets.
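The coarse-to-fine search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the compression operator or the scoring function, so the sketch assumes fixed-width spans compressed by a tanh over the mean token vector, a single document vector formed by averaging span vectors, and a MaxSim-style late-interaction score for the fine stage. All function names and parameters (`span_vectors`, `coarse_to_fine`, `span_width`, `k_coarse`, `k_final`) are hypothetical, and the embeddings are random stand-ins for encoder outputs.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vecs):
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def span_vectors(token_vecs, span_width):
    """Compress token vectors into one vector per fixed-width span.

    Assumption: the non-linear operation is tanh applied to the span mean.
    """
    spans = []
    for start in range(0, len(token_vecs), span_width):
        chunk = token_vecs[start:start + span_width]
        spans.append([math.tanh(x) for x in mean_vector(chunk)])
    return spans


def build_index(docs, span_width):
    """Store span vectors (fine stage) and one pooled vector (coarse stage) per document."""
    index = {}
    for doc_id, token_vecs in docs.items():
        spans = span_vectors(token_vecs, span_width)
        index[doc_id] = {"spans": spans, "single": mean_vector(spans)}
    return index


def maxsim(query_vecs, doc_spans):
    """Assumed fine-grained score: each query vector takes its best-matching span."""
    return sum(max(dot(q, s) for s in doc_spans) for q in query_vecs)


def coarse_to_fine(query_vecs, index, k_coarse, k_final):
    # Coarse stage: rank all documents by a single-vector dot product.
    q_single = mean_vector(query_vecs)
    coarse = sorted(
        index, key=lambda d: dot(q_single, index[d]["single"]), reverse=True
    )[:k_coarse]
    # Fine stage: re-rank only the coarse candidates with multi-vector scoring.
    fine = sorted(
        coarse, key=lambda d: maxsim(query_vecs, index[d]["spans"]), reverse=True
    )
    return fine[:k_final]


# Toy data: random vectors stand in for encoder outputs.
random.seed(0)
DIM = 8
docs = {
    f"doc{i}": [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(12)]
    for i in range(20)
}
index = build_index(docs, span_width=4)
query = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]
results = coarse_to_fine(query, index, k_coarse=5, k_final=2)
```

In this sketch each 12-token document is stored as 3 span vectors, so the fine stage compares the query against a quarter as many vectors per document as token-level scoring would.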
Anthology ID:
2025.coling-industry.63
Volume:
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
760–770
URL:
https://aclanthology.org/2025.coling-industry.63/
Cite (ACL):
Cheoneum Park, Seohyeong Jeong, Minsang Kim, KyungTae Lim, and Yong-Hun Lee. 2025. SCV: Light and Effective Multi-Vector Retrieval with Sequence Compressive Vectors. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 760–770, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
SCV: Light and Effective Multi-Vector Retrieval with Sequence Compressive Vectors (Park et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-industry.63.pdf