HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval

Jaeyoung Kim; Dohyeon Lee; Seung-won Hwang

doi:10.18653/v1/2024.naacl-long.437

Abstract

Advancements in dense retrieval models have brought ColBERT to prominence in Information Retrieval (IR) with its advanced interaction techniques. However, ColBERT is reported to frequently underperform in zero-shot scenarios, where traditional techniques such as BM25 still exceed it. To address this, we propose balancing representation isotropy and anisotropy to improve zero-shot performance, based on our observations that isotropy can enhance cosine similarity computations and that anisotropy may aid in generalizing to unseen data. Striking a balance between these isotropic and anisotropic qualities is therefore a critical objective for refining model efficacy. Based on this, we present Hybrid Isotropy Learning (HIL), an architecture that integrates isotropic and anisotropic representations. Our experiments on the BEIR benchmark show that our model significantly outperforms the baseline ColBERT model, highlighting the importance of harmonized isotropy in improving zero-shot retrieval performance.
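
To make the notion of isotropy in the abstract concrete, the following is a minimal sketch of one common proxy: the mean pairwise cosine similarity of a set of embeddings. It is an illustration only, not the measure or architecture used in the paper. The function names, the synthetic Gaussian vectors, and the shared offset are all assumptions introduced here. A value near 0 suggests directions are spread evenly (isotropic), while a value near 1 suggests vectors crowd into a narrow cone (anisotropic).

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all distinct pairs (hypothetical isotropy proxy)."""
    total, count = 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            total += cosine(vectors[i], vectors[j])
            count += 1
    return total / count


rng = random.Random(0)
dim, n = 32, 100

# Assumption: zero-mean Gaussian vectors stand in for isotropic embeddings.
isotropic = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

# Assumption: adding one shared offset to every vector mimics an anisotropic space,
# where all embeddings point in roughly the same direction.
offset = [3.0] * dim
anisotropic = [[x + o for x, o in zip(v, offset)] for v in isotropic]

iso_score = mean_pairwise_cosine(isotropic)
aniso_score = mean_pairwise_cosine(anisotropic)
print(f"isotropic set:   {iso_score:.3f}")
print(f"anisotropic set: {aniso_score:.3f}")
```

In the anisotropic set every pair scores high regardless of content, so cosine similarity separates vectors less sharply; this is one intuition for why isotropy can help cosine-based scoring.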

Anthology ID: 2024.naacl-long.437
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 7892–7903
URL: https://aclanthology.org/2024.naacl-long.437
DOI: 10.18653/v1/2024.naacl-long.437
Cite (ACL): Jaeyoung Kim, Dohyeon Lee, and Seung-won Hwang. 2024. HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 7892–7903, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): HIL: Hybrid Isotropy Learning for Zero-shot Performance in Dense retrieval (Kim et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-long.437.pdf