Accelerating Learned Sparse Indexes Via Term Impact Decomposition

Joel Mackenzie, Antonio Mallia, Alistair Moffat, Matthias Petri


Abstract
Novel inverted index-based learned sparse ranking models provide more effective, but less efficient, retrieval performance compared to traditional ranking models like BM25. In this paper, we introduce a technique we call postings clipping to improve the query efficiency of learned representations. Our technique amplifies the benefit of dynamic pruning query processing techniques by accounting for changes in term importance distributions of learned ranking models. The new clipping mechanism accelerates top-k retrieval by up to 9.6× without any loss in effectiveness.
Anthology ID:
2022.findings-emnlp.205
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2022
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2830–2842
URL:
https://aclanthology.org/2022.findings-emnlp.205
DOI:
10.18653/v1/2022.findings-emnlp.205
Cite (ACL):
Joel Mackenzie, Antonio Mallia, Alistair Moffat, and Matthias Petri. 2022. Accelerating Learned Sparse Indexes Via Term Impact Decomposition. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 2830–2842, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Accelerating Learned Sparse Indexes Via Term Impact Decomposition (Mackenzie et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-emnlp.205.pdf