LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation

Hongyun Zhou, Xiangyu Lu, Wang Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang


Abstract
Low-Rank Adaptation (LoRA) is currently the most commonly used parameter-efficient fine-tuning (PEFT) method. However, it still incurs high computational and storage costs when applied to models with billions of parameters. Most previous studies have tackled this issue with pruning techniques, but these efforts evaluate importance only from features of the LoRA parameters themselves, such as parameter count, size, and gradient. In fact, it is the output of LoRA that directly impacts the fine-tuned model. Preliminary experiments indicate that a fraction of LoRA modules have significantly high output values and substantially influence the layer output. Motivated by this observation, we propose LoRA-drop. Concretely, LoRA-drop evaluates the importance of each LoRA module based on its output; LoRA is retained for the important layers, while the remaining layers share a single LoRA. We conduct extensive experiments with models of different scales on NLU and NLG tasks. Results demonstrate that LoRA-drop achieves performance comparable to full fine-tuning and standard LoRA while retaining only 50% of the LoRA parameters on average.
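A minimal sketch of the output-based scoring the abstract describes: each LoRA module is scored by the magnitude of its output BAx over sampled inputs, the top-scoring layers keep their own LoRA, and the rest are marked to share one. The function names, dimensions, and the squared-norm proxy are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of output-based LoRA importance scoring (assumed
# details; not the paper's reference implementation).
import torch

def lora_output_importance(lora_A, lora_B, xs):
    """Score one LoRA module by the squared norm of its output B @ A @ x,
    averaged over sampled inputs xs of shape [num_samples, in_dim]."""
    with torch.no_grad():
        delta = xs @ lora_A.T @ lora_B.T  # LoRA output for each sample
        return delta.pow(2).sum(dim=-1).mean().item()

# Toy example: score the LoRA modules of 4 layers, retain LoRA for the
# top-2 layers, and mark the remaining layers to share a single LoRA.
torch.manual_seed(0)
in_dim, out_dim, rank, num_layers = 16, 16, 4, 4
xs = torch.randn(32, in_dim)  # hidden states sampled from the task data
modules = [(torch.randn(rank, in_dim), torch.randn(out_dim, rank))
           for _ in range(num_layers)]

scores = [lora_output_importance(A, B, xs) for A, B in modules]
keep = sorted(range(num_layers), key=lambda i: scores[i], reverse=True)[:2]
shared = [i for i in range(num_layers) if i not in keep]
print("retain own LoRA:", keep, "| share one LoRA:", shared)
```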
Anthology ID:
2025.coling-main.371
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
5530–5543
URL:
https://aclanthology.org/2025.coling-main.371/
Cite (ACL):
Hongyun Zhou, Xiangyu Lu, Wang Xu, Conghui Zhu, Tiejun Zhao, and Muyun Yang. 2025. LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5530–5543, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation (Zhou et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.371.pdf