Fast Vocabulary Transfer for Language Model Compression

Leonidas Gee, Andrea Zugarini, Leonardo Rigutini, Paolo Torroni
Abstract
Real-world business applications require a trade-off between language model performance and size. We propose a new method for model compression that relies on vocabulary transfer. We evaluate the method on various vertical domains and downstream tasks. Our results indicate that vocabulary transfer can be effectively used in combination with other compression techniques, yielding a significant reduction in model size and inference time while marginally compromising on performance.
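
For context, the vocabulary-transfer idea can be sketched in a few lines: the model's embedding matrix is re-initialised for a new, smaller in-domain vocabulary by composing embeddings from the original one. The snippet below is a minimal illustration in Python using the Hugging Face transformers API, not the authors' released implementation; the averaging rule, the transfer_vocabulary helper, and the in-domain tokenizer path are assumptions for exposition.

import torch
from transformers import AutoModel, AutoTokenizer

def transfer_vocabulary(model, old_tok, new_tok):
    """Initialise embeddings for a new (e.g. in-domain) vocabulary
    from a model trained with the old vocabulary (illustrative sketch)."""
    old_emb = model.get_input_embeddings().weight.data
    new_emb = torch.empty(len(new_tok), old_emb.size(1))
    for token, new_id in new_tok.get_vocab().items():
        # Split the new token into pieces the old tokenizer knows;
        # stripping '##' assumes WordPiece-style vocabularies.
        pieces = old_tok.tokenize(token.replace("##", ""))
        old_ids = old_tok.convert_tokens_to_ids(pieces)
        if old_ids:
            # One plausible rule: average the old embeddings of the pieces.
            new_emb[new_id] = old_emb[old_ids].mean(dim=0)
        else:
            # Fallback for tokens the old tokenizer cannot decompose.
            new_emb[new_id] = old_emb.mean(dim=0)
    model.resize_token_embeddings(len(new_tok))
    model.get_input_embeddings().weight.data.copy_(new_emb)
    return model

# Hypothetical usage: shrink the model by switching to a smaller in-domain tokenizer.
model = AutoModel.from_pretrained("bert-base-uncased")
old_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
new_tok = AutoTokenizer.from_pretrained("path/to/in-domain-tokenizer")  # assumed artifact
model = transfer_vocabulary(model, old_tok, new_tok)

Initialising the new embeddings this way gives the compressed model a warm start, so a short round of fine-tuning can recover most of the original performance.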
Anthology ID: 2022.emnlp-industry.41
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: December
Year: 2022
Address: Abu Dhabi, UAE
Editors: Yunyao Li, Angeliki Lazaridou
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 409–416
URL: https://aclanthology.org/2022.emnlp-industry.41
DOI: 10.18653/v1/2022.emnlp-industry.41
Cite (ACL): Leonidas Gee, Andrea Zugarini, Leonardo Rigutini, and Paolo Torroni. 2022. Fast Vocabulary Transfer for Language Model Compression. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 409–416, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Fast Vocabulary Transfer for Language Model Compression (Gee et al., EMNLP 2022)
PDF: https://aclanthology.org/2022.emnlp-industry.41.pdf