Title: Self-Distilled Quantization: Achieving High Compression Rates in Transformer-Based Language Models
Authors: James O’Neill; Sourav Dutta
Date: 2023-07
Resource type: text
Genre: conference publication
Published in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Editors: Anna Rogers; Jordan Boyd-Graber; Naoaki Okazaki
Publisher: Association for Computational Linguistics
Place: Toronto, Canada
Pages: 1329-1339
Citation key: oneill-dutta-2023-self
DOI: 10.18653/v1/2023.acl-short.114
URL: https://aclanthology.org/2023.acl-short.114/