Title: Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
Authors: Cheng Zhang, Jianyi Cheng, Ilia Shumailov, George Constantinides, Yiren Zhao
Date: 2023-12
Type: text (conference publication)
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Publisher: Association for Computational Linguistics
Place: Singapore
Pages: 9988-10006
Identifier: zhang-etal-2023-revisiting
DOI: 10.18653/v1/2023.emnlp-main.617
URL: https://aclanthology.org/2023.emnlp-main.617/