FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning

Aaron Xuxiang Tian, Yi Zhao, Congrui Yin, Wei Zhu, Xing Tian, Yi Ge


Abstract
Full-parameter fine-tuning is computationally prohibitive for large language models (LLMs), making parameter-efficient fine-tuning (PEFT) methods such as low-rank adaptation (LoRA) increasingly popular. However, LoRA and its existing variants introduce significant inference latency in multi-tenant settings, hindering their application in industry. To address this issue, we propose the Fantastic LoRA (FanLoRA) framework, which consists of four steps: (a) add LoRA modules to all Transformer linear weights and fine-tune on a large-scale instruction-tuning dataset; (b) assess the importance of each module with a novel importance scoring method; (c) retain only the most critical modules per layer, yielding the FanLoRA setting; (d) apply the FanLoRA setting when fine-tuning on various downstream tasks. Our extensive experiments demonstrate that: (a) FanLoRA outperforms existing PEFT baselines across a wide collection of tasks with a comparable number of tunable parameters, and (b) FanLoRA significantly reduces the inference latency of LoRA, making it valuable for further broadening the applications of LLMs in industry.
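
The four-step pipeline described in the abstract can be illustrated with a minimal PyTorch sketch. Everything below is illustrative rather than the paper's implementation: the names (LoRALinear, add_lora_to_all_linears, keep_top_k_per_layer) are invented for this example, and the importance score used (the norm of the learned low-rank update B·A) is only a stand-in, since the abstract does not spell out the paper's novel scoring method.

import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer with a trainable low-rank (LoRA) update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # step (a): only the LoRA weights are tuned
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

    def importance(self) -> float:
        # Proxy score: magnitude of the learned update B @ A. This is an
        # assumption for illustration, not the paper's scoring method.
        return (self.lora_B @ self.lora_A).norm().item() * self.scaling


def add_lora_to_all_linears(model: nn.Module, rank: int = 8) -> None:
    """Step (a): attach LoRA modules to every linear weight in the model."""
    # Snapshot (parent, child-name) pairs first so freshly inserted LoRA
    # wrappers are not themselves re-wrapped during traversal.
    targets = [(parent, name) for parent in model.modules()
               for name, child in parent.named_children()
               if isinstance(child, nn.Linear)]
    for parent, name in targets:
        setattr(parent, name, LoRALinear(getattr(parent, name), rank=rank))


def keep_top_k_per_layer(layer: nn.Module, k: int = 2) -> None:
    """Steps (b)-(c): score the LoRA modules in one layer and keep the top-k."""
    scored = [(name, mod) for name, mod in layer.named_children()
              if isinstance(mod, LoRALinear)]
    scored.sort(key=lambda nm: nm[1].importance(), reverse=True)
    for name, mod in scored[k:]:          # drop the least important adapters
        setattr(layer, name, mod.base)    # revert to the frozen base weight


if __name__ == "__main__":
    # Toy "Transformer layer" with several linear sub-modules.
    layer = nn.ModuleDict({
        "q_proj": nn.Linear(64, 64),
        "k_proj": nn.Linear(64, 64),
        "v_proj": nn.Linear(64, 64),
        "o_proj": nn.Linear(64, 64),
    })
    add_lora_to_all_linears(layer, rank=4)
    # ... fine-tune the LoRA weights on an instruction-tuning set here ...
    keep_top_k_per_layer(layer, k=2)      # step (c): the "FanLoRA setting"
    print([n for n, m in layer.named_children() if isinstance(m, LoRALinear)])

In step (d), only the retained adapter positions would receive fresh LoRA modules when fine-tuning on a downstream task, which is what reduces the number of adapters that must be merged or served per request.
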
Anthology ID:
2024.emnlp-industry.38
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2024
Address:
Miami, Florida, US
Editors:
Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
515–528
URL:
https://aclanthology.org/2024.emnlp-industry.38
Cite (ACL):
Aaron Xuxiang Tian, Yi Zhao, Congrui Yin, Wei Zhu, Xing Tian, and Yi Ge. 2024. FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 515–528, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal):
FanLoRA: Fantastic LoRAs and Where to Find Them in Large Language Model Fine-tuning (Tian et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-industry.38.pdf