Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other

Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Xiangzhiyuan Xiangzhiyuan, Ruiting Dai, Jun Cheng


Abstract
Emergent Large Language Models (LLMs) distinguish themselves from traditional language models through their extraordinary performance and powerful deduction capacity. However, their computational and storage costs are substantial, which has made quantization a topic of growing interest. To address the accuracy degradation caused by quantization, two lines of work in post-training quantization stand out. One uses other weights to compensate for existing quantization error, while the other transfers the quantization difficulty to other parts of the model. Combining the merits of both, we introduce Learnable Singular value Increment (LSI) as an advanced solution. LSI uses Singular Value Decomposition to extract the singular values of the weights and makes them learnable, helping the weights compensate for each other conditioned on activations. Incorporating LSI with existing techniques, we achieve state-of-the-art performance in diverse quantization settings, whether weight-only, weight-activation, or extremely low-bit scenarios. With LSI, efficient finetuning of quantized models is no longer a prohibitive problem.
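The parameterization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the quantizer or the training objective, so the per-tensor round-to-nearest quantizer, the helper names (quantize_rtn, reconstruct), the variable delta, and the random data are all assumptions made for illustration. The sketch only shows where a learnable increment on the singular values would enter; it performs no training.

```python
import numpy as np


def quantize_rtn(w, n_bits=4):
    """Uniform per-tensor round-to-nearest quantization, returned dequantized.

    Assumption: a simple stand-in quantizer, not the one used in the paper.
    """
    qmax = 2 ** n_bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / qmax
    if scale == 0:
        return w.copy()
    q = np.clip(np.round((w - w_min) / scale), 0, qmax)
    return q * scale + w_min


def reconstruct(u, s, vt, delta):
    """Rebuild a weight matrix from its SVD with incremented singular values."""
    # u * (s + delta) scales each column of u, i.e. u @ diag(s + delta).
    return (u * (s + delta)) @ vt


rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 6))   # hypothetical layer weight
acts = rng.standard_normal((16, 8))    # hypothetical calibration activations

# Decompose once; only the increment on the singular values would be trained.
u, s, vt = np.linalg.svd(weight, full_matrices=False)
delta = np.zeros_like(s)               # hypothetical learnable increment, initialised to zero

w_adjusted = reconstruct(u, s, vt, delta)
w_quantized = quantize_rtn(w_adjusted, n_bits=4)

# Activation-conditioned error that a training loop could minimise over delta.
output_error = np.linalg.norm(acts @ weight - acts @ w_quantized)
print(f"output error with zero increment: {output_error:.4f}")
```

With delta set to zero the reconstruction equals the original weight, so the quantized model starts from the plain round-to-nearest baseline. Because every singular value scales a full rank-one component, changing one entry of delta shifts many weights at once, which is one way to read the abstract's claim that the weights compensate for each other.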
Anthology ID:
2024.findings-naacl.173
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2711–2722
URL:
https://aclanthology.org/2024.findings-naacl.173
DOI:
10.18653/v1/2024.findings-naacl.173
Cite (ACL):
Yifei Gao, Jie Ou, Lei Wang, Yuting Xiao, Xiangzhiyuan Xiangzhiyuan, Ruiting Dai, and Jun Cheng. 2024. Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2711–2722, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Compensate Quantization Errors: Make Weights Hierarchical to Compensate Each Other (Gao et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.173.pdf