Brian Gallagher


2024

RankMean: Module-Level Importance Score for Merging Fine-tuned LLM Models
Gabriel Perin | Xuxi Chen | Shusen Liu | Bhavya Kailkhura | Zhangyang Wang | Brian Gallagher
Findings of the Association for Computational Linguistics: ACL 2024

Traditionally, developing new language models (LMs) capable of addressing multiple tasks involves fine-tuning pre-trained LMs using a wide collection of datasets, a process that often incurs significant computational expenses. Model merging emerges as a cost-effective alternative, allowing the integration of existing models fine-tuned on different tasks into a single model that performs well across all tasks, eliminating the need for additional training. In this paper, we propose RankMean, an algorithm for merging fine-tuned LMs without requiring any downstream data. RankMean determines merging coefficients based on the relative rankings of weight change magnitudes and applies these coefficients for module-wise integration of various fine-tuned models. Our experimental results demonstrate that RankMean outperforms existing baseline methods on multiple benchmarks. The code is available at https://github.com/VITA-Group/RankMean.
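The abstract describes the approach only at a high level, so the following is a minimal sketch of rank-based, module-wise merging under stated assumptions, not the authors' implementation (see the linked repository for that). It assumes that the "weight change magnitude" is the L2 norm of the difference between a fine-tuned module and the base module, that modules are ranked within each fine-tuned model (rank 1 = smallest change), and that the ranks for a module are normalized across models to give coefficients summing to 1. The function name `rank_based_merge` and the toy module names are hypothetical.

```python
import math


def rank_based_merge(base, finetuned):
    """Merge fine-tuned models module by module with rank-derived coefficients.

    base: dict mapping module name -> flat list of weights
    finetuned: list of dicts with the same keys and lengths as `base`
    All design choices below are illustrative assumptions.
    """
    names = list(base)

    # Step 1: per model and per module, the magnitude of the weight change
    # (assumed here to be the L2 norm of fine-tuned minus base weights).
    magnitudes = []
    for model in finetuned:
        magnitudes.append({
            n: math.sqrt(sum((w - b) ** 2 for w, b in zip(model[n], base[n])))
            for n in names
        })

    # Step 2: within each model, rank modules by change magnitude
    # (rank 1 = smallest change, rank len(names) = largest change).
    ranks = []
    for mags in magnitudes:
        ordered = sorted(names, key=lambda n: mags[n])
        ranks.append({n: i + 1 for i, n in enumerate(ordered)})

    # Step 3: for each module, normalize the ranks across models into
    # coefficients that sum to 1, then take the weighted average of weights.
    merged = {}
    for n in names:
        total = sum(r[n] for r in ranks)
        coeffs = [r[n] / total for r in ranks]
        merged[n] = [
            sum(c * model[n][i] for c, model in zip(coeffs, finetuned))
            for i in range(len(base[n]))
        ]
    return merged


# Toy example: two "models" with two modules each (hypothetical values).
base = {"attn": [0.0, 0.0], "mlp": [0.0, 0.0]}
model_a = {"attn": [1.0, 1.0], "mlp": [0.1, 0.1]}  # changed mostly in attn
model_b = {"attn": [0.2, 0.2], "mlp": [2.0, 2.0]}  # changed mostly in mlp
merged = rank_based_merge(base, [model_a, model_b])
print(merged)
```

In this toy case, each module of the merged model is weighted toward whichever fine-tuned model changed that module relatively more, and no downstream data is used. Other choices (a different norm, ranking across models rather than within them, or merging weight deltas instead of raw weights) are equally compatible with the abstract's description.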