Title: IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
Authors: Ruikang Liu, Haoli Bai, Haokun Lin, Yuening Li, Han Gao, Zhengzhuo Xu, Lu Hou, Jun Yao, Chun Yuan
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Location: Bangkok, Thailand
Date: 2024-08
Type: conference publication
Pages: 7716-7741
Citation key: liu-etal-2024-intactkv
DOI: 10.18653/v1/2024.findings-acl.460
URL: https://aclanthology.org/2024.findings-acl.460/