Brian Benedict


2024

Arcee’s MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard | Shamane Siriwardhana | Malikeh Ehghaghi | Luke Meyers | Vladimir Karpukhin | Brian Benedict | Mark McQuade | Jacob Solawetz
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

The rapid growth of open-source language models provides the opportunity to merge model checkpoints, combining their parameters to improve performance and versatility. Advances in transfer learning have led to numerous task-specific models, which model merging can integrate into powerful multitask models without additional training. MergeKit is an open-source library designed to support this process with an efficient and extensible framework suitable for any hardware. It has facilitated the merging of thousands of models, contributing to some of the world’s most powerful open-source model checkpoints. The library is accessible at: https://github.com/arcee-ai/mergekit.
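
As a rough illustration of what "combining their parameters" can mean, the sketch below takes a weighted average of two toy checkpoints, parameter by parameter. It is not MergeKit's API or implementation: the function name `linear_merge`, the `StateDict` alias, and the representation of a checkpoint as a plain dictionary of float lists are all assumptions made for this example, and weighted averaging is only one simple merging strategy.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def linear_merge(checkpoints: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Return the weighted average of several checkpoints, parameter by parameter."""
    if not checkpoints:
        raise ValueError("at least one checkpoint is required")
    if len(checkpoints) != len(weights):
        raise ValueError("need exactly one weight per checkpoint")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    # All checkpoints must share the same parameter names to be merged this way.
    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints have different parameter names")

    merged: StateDict = {}
    for name in names:
        size = len(checkpoints[0][name])
        acc = [0.0] * size
        for ckpt, w in zip(checkpoints, weights):
            values = ckpt[name]
            if len(values) != size:
                raise ValueError(f"shape mismatch for parameter {name!r}")
            # Normalise the weight so the coefficients sum to one.
            for i, v in enumerate(values):
                acc[i] += (w / total) * v
        merged[name] = acc
    return merged


# Two toy "models" with identical parameter names and shapes.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}
merged = linear_merge([model_a, model_b], weights=[1.0, 1.0])
```

No gradient step or training data is involved here: the merged parameters are computed directly from the existing ones, which is the sense in which merging works "without additional training".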