Can Post-Training Quantization Benefit from an Additional QLoRA Integration?

Xiliang Zhu; Elena Khasanova; Cheng Chen

doi:10.18653/v1/2025.naacl-industry.41

Abstract

Large language models (LLMs) have transformed natural language processing but pose significant challenges for real-world deployment. These models necessitate considerable computing resources, which can be costly and frequently unavailable. Model compression techniques such as quantization are often leveraged to alleviate resource demand, but they may have a negative impact on the generation quality. In this study, we explore the integration of 4-bit Post-training Quantization (PTQ) with QLoRA to address these issues. We demonstrate through extensive experiments that this integration outperforms standard PTQ, and in some cases even 16-bit full-parameter fine-tuning on LLMs, validated across proprietary and public datasets with different quantization algorithms. The results demonstrate the efficacy of PTQ-QLoRA integration, offering a viable solution for deploying powerful LLMs in resource-constrained environments without compromising on performance.

Anthology ID: 2025.naacl-industry.41
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Weizhu Chen, Yi Yang, Mohammad Kachuee, Xue-Yong Fu
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 506–514
URL: https://aclanthology.org/2025.naacl-industry.41/
DOI: 10.18653/v1/2025.naacl-industry.41
Cite (ACL): Xiliang Zhu, Elena Khasanova, and Cheng Chen. 2025. Can Post-Training Quantization Benefit from an Additional QLoRA Integration? In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track), pages 506–514, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Can Post-Training Quantization Benefit from an Additional QLoRA Integration? (Zhu et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-industry.41.pdf
