Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions

Jinsung Yoon, Rajarishi Sinha, Sercan O Arik, Tomas Pfister


Abstract
Embeddings from Large Language Models (LLMs) have emerged as critical components in various applications, particularly information retrieval. While high-dimensional embeddings generally demonstrate superior performance because they contain more salient information, their practical application is frequently hindered by elevated computational latency and higher cost. To address these challenges, we propose Matryoshka-Adaptor, a novel tuning framework for the customization of LLM embeddings. Matryoshka-Adaptor facilitates substantial dimensionality reduction while maintaining comparable performance, thereby significantly improving computational efficiency and cost-effectiveness. Our framework directly modifies the embeddings from pre-trained LLMs and is designed to integrate seamlessly with any LLM architecture, including those accessible only through black-box APIs. It is also effective in both unsupervised and supervised learning settings. A rigorous evaluation across a diverse corpus of English, multilingual, and multimodal datasets consistently reveals substantial gains with Matryoshka-Adaptor. Notably, with the Google and OpenAI Embedding APIs, Matryoshka-Adaptor achieves a two- to twelve-fold reduction in dimensionality without compromising performance across multiple BEIR datasets.
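To make the idea concrete, the sketch below illustrates one plausible way an adaptor over frozen, black-box embeddings could be trained so that truncated (leading) dimensions preserve the similarity structure of the full embeddings. This is an illustrative reconstruction based only on the abstract, not the authors' exact architecture or objective; the module name, residual MLP design, prefix dimensions, and loss terms are all assumptions.

```python
# Hypothetical sketch of a Matryoshka-style adaptor over frozen embeddings.
# NOT the paper's exact method: architecture, losses, and hyperparameters
# below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatryoshkaAdaptor(nn.Module):
    """Small residual MLP that re-organizes a frozen LLM embedding so that
    its leading dimensions remain useful after truncation (hypothetical)."""
    def __init__(self, dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the full-dimensional embedding close
        # to the original pre-trained one.
        return x + self.net(x)

def unsupervised_loss(orig: torch.Tensor, adapted: torch.Tensor,
                      prefix_dims=(64, 128, 256)) -> torch.Tensor:
    """Match pairwise similarities of truncated adapted embeddings to those
    of the full original embeddings (an assumed, simplified objective)."""
    target = F.normalize(orig, dim=-1) @ F.normalize(orig, dim=-1).T
    loss = F.mse_loss(adapted, orig)  # stay close to the original embedding
    for d in prefix_dims:
        z = F.normalize(adapted[:, :d], dim=-1)
        loss = loss + F.mse_loss(z @ z.T, target)
    return loss

# Usage: `embeddings` would come from an embedding API (e.g. a black-box LLM);
# here a random batch stands in for those outputs.
embeddings = torch.randn(32, 768)
adaptor = MatryoshkaAdaptor(dim=768)
opt = torch.optim.Adam(adaptor.parameters(), lr=1e-4)
for _ in range(5):  # toy training loop
    opt.zero_grad()
    loss = unsupervised_loss(embeddings, adaptor(embeddings))
    loss.backward()
    opt.step()
```

After such tuning, retrieval would use only the first 64 or 128 adapted dimensions, which is where the claimed two- to twelve-fold dimensionality reduction would come from; the supervised setting described in the paper would additionally exploit query-document relevance labels.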
Anthology ID:
2024.emnlp-main.576
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10318–10336
URL:
https://aclanthology.org/2024.emnlp-main.576
DOI:
10.18653/v1/2024.emnlp-main.576
Cite (ACL):
Jinsung Yoon, Rajarishi Sinha, Sercan O Arik, and Tomas Pfister. 2024. Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 10318–10336, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Matryoshka-Adaptor: Unsupervised and Supervised Tuning for Smaller Embedding Dimensions (Yoon et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.576.pdf