Title: Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding
Authors: Jie Ou; Yueming Chen; Prof. Tian
Date: 2024-06
Type: text (conference publication)
In: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
Editors: Yi Yang; Aida Davani; Avi Sil; Anoop Kumar
Publisher: Association for Computational Linguistics
Place: Mexico City, Mexico
Pages: 10-22
Citation key: ou-etal-2024-lossless
DOI: 10.18653/v1/2024.naacl-industry.2
URL: https://aclanthology.org/2024.naacl-industry.2/