Fast and Accurate Fisher-Guided Quantization via Efficient Kronecker Factorization

Viktoriia A. Chekalina; Gerasin Timofey; Andrey Kuznetsov; Evgeny Frolov

Abstract

Quantization has shown strong results in preserving model quality under compression. However, under aggressive bit-width reduction, quantization alone may need additional information to prevent performance degradation. A natural source of such information is second-order curvature, captured by the Hessian. Because the Hessian of a model's layers is prohibitively large, computing it directly is infeasible, which makes structured parameterizations and approximations crucial in practice. In this work, we propose an efficient Kronecker-factored approximation that yields state-of-the-art performance when integrated into existing quantization schemes. Evaluations on the LLaMA and Qwen model families show near-baseline quality at 4-bit compression and only a 5–6% degradation at 2-bit. Moreover, our method substantially accelerates the most expensive component of second-order quantization, Hessian parameterization, achieving up to a 10× speedup over prior approaches.
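To make the idea of a Kronecker-factored curvature approximation concrete, the sketch below shows the generic K-FAC-style construction for a single linear layer. It is an illustration only and is not taken from the paper: the function names `kronecker_fisher_factors` and `fisher_quadratic_form`, the layer sizes, and the random data are all assumptions. The full Fisher matrix for a weight of shape (d_out, d_in) has (d_out·d_in)² entries; the two factors together have only d_in² + d_out².

```python
import numpy as np


def kronecker_fisher_factors(inputs, grad_outputs):
    """Estimate Kronecker factors A and G such that Fisher ~ A (x) G.

    inputs:       (n, d_in)  layer inputs, one row per sample (hypothetical data)
    grad_outputs: (n, d_out) loss gradients w.r.t. layer outputs
    """
    n = inputs.shape[0]
    A = inputs.T @ inputs / n              # (d_in, d_in) input second moment
    G = grad_outputs.T @ grad_outputs / n  # (d_out, d_out) gradient second moment
    return A, G


def fisher_quadratic_form(delta_w, A, G):
    """Evaluate vec(dW)^T (A (x) G) vec(dW) without forming the Kronecker product.

    Uses the identity (A (x) G) vec(X) = vec(G X A^T) for column-major vec,
    so the quadratic form equals trace(dW^T G dW A) when A is symmetric.
    delta_w has shape (d_out, d_in), e.g. a quantization error W - Q(W).
    """
    return float(np.trace(delta_w.T @ G @ delta_w @ A))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out, n = 6, 8, 32
    x = rng.normal(size=(n, d_in))
    g = rng.normal(size=(n, d_out))
    A, G = kronecker_fisher_factors(x, g)

    dW = rng.normal(size=(d_out, d_in))
    cheap = fisher_quadratic_form(dW, A, G)

    # Reference value using the explicit (d_out*d_in)^2 matrix; only feasible
    # here because the toy layer is tiny.
    v = dW.flatten(order="F")
    dense = float(v @ np.kron(A, G) @ v)

    print("factored entries:", A.size + G.size)
    print("dense entries:   ", (d_in * d_out) ** 2)
    print("forms agree:     ", np.isclose(cheap, dense))
```

In a quantization setting, a quadratic form of this kind could serve as a curvature-weighted measure of the error introduced by rounding the weights; how the paper actually estimates and uses its factors is not specified in this abstract.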

Anthology ID: 2026.acl-long.1805
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 38925–38935
URL: https://aclanthology.org/2026.acl-long.1805/
Cite (ACL): Viktoriia A. Chekalina, Gerasin Timofey, Andrey Kuznetsov, and Evgeny Frolov. 2026. Fast and Accurate Fisher-Guided Quantization via Efficient Kronecker Factorization. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 38925–38935, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Fast and Accurate Fisher-Guided Quantization via Efficient Kronecker Factorization (Chekalina et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1805.pdf
Checklist: 2026.acl-long.1805.checklist.pdf
