ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs

Zige Wang; Qi Zhu; Fei Mi; Minghui Xu; Ruochun Jin; Wenjing Yang

doi:10.18653/v1/2025.findings-emnlp.1026

Abstract

Gradient-based data influence approximation has been leveraged to select useful data samples in the supervised fine-tuning of large language models. However, the computation of gradients throughout the fine-tuning process requires too many resources to be feasible in practice. In this paper, we propose an efficient gradient-based data selection framework with clustering and a modified Upper Confidence Bound (UCB) algorithm. Based on the intuition that data samples with similar gradient features will have similar influences, we first perform clustering on the training data pool. Then, we frame the inter-cluster data selection as a constrained computing budget allocation problem and consider it a multi-armed bandit problem. A modified UCB algorithm is leveraged to solve this problem. Specifically, during the iterative sampling process, historical data influence information is recorded to directly estimate the distributions of each cluster, and a cold start is adopted to balance exploration and exploitation. Experimental results on various benchmarks show that our proposed framework, ClusterUCB, can achieve comparable results to the original gradient-based data selection methods while greatly reducing computing consumption.
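The allocation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses the textbook UCB1 rule rather than the paper's modified UCB, it assumes the clusters are already computed, and `influence_fn`, `cold_start`, and the exploration constant `c` are hypothetical names introduced here. In the paper's setting, the influence score would come from an expensive gradient computation, which is why the number of evaluations is budgeted.

```python
import math
import random


def ucb_budgeted_selection(clusters, influence_fn, budget, cold_start=2, c=1.0, seed=0):
    """Spend `budget` influence evaluations across clusters with a UCB1-style rule.

    clusters: list of lists of sample ids (assumed to be clustered beforehand).
    influence_fn: hypothetical callable mapping a sample id to a float score.
    Returns {sample_id: score} for every sample that was evaluated.
    """
    rng = random.Random(seed)
    # Shuffled copies, so that pop() draws without replacement.
    pools = [rng.sample(cl, len(cl)) for cl in clusters]
    history = [[] for _ in clusters]  # recorded influence scores per cluster
    scored = {}
    spent = 0

    def draw(k):
        nonlocal spent
        sample_id = pools[k].pop()
        score = influence_fn(sample_id)
        history[k].append(score)
        scored[sample_id] = score
        spent += 1

    # Cold start: evaluate a few samples from every cluster first.
    for k in range(len(pools)):
        for _ in range(cold_start):
            if pools[k] and spent < budget:
                draw(k)

    def upper_bound(k):
        n = len(history[k])
        mean = sum(history[k]) / n
        return mean + c * math.sqrt(2 * math.log(spent) / n)

    # Main loop: draw from the non-empty cluster with the highest upper bound.
    while spent < budget:
        live = [k for k in range(len(pools)) if pools[k] and history[k]]
        if not live:
            break
        draw(max(live, key=upper_bound))

    return scored


# Toy usage: three clusters whose (made-up) influence grows with the sample id.
toy_clusters = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]
toy_scores = ucb_budgeted_selection(toy_clusters, lambda i: i / 10, budget=12)
top_five = sorted(toy_scores, key=toy_scores.get, reverse=True)[:5]
```

Only evaluated samples are candidates for the final selection, so the budget directly caps the number of influence computations; the cold start guarantees every cluster has at least some recorded history before the bound is used.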

Anthology ID: 2025.findings-emnlp.1026
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 18867–18880
URL: https://aclanthology.org/2025.findings-emnlp.1026/
DOI: 10.18653/v1/2025.findings-emnlp.1026
Cite (ACL): Zige Wang, Qi Zhu, Fei Mi, Minghui Xu, Ruochun Jin, and Wenjing Yang. 2025. ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 18867–18880, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): ClusterUCB: Efficient Gradient-Based Data Selection for Targeted Fine-Tuning of LLMs (Wang et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.1026.pdf
Checklist: 2025.findings-emnlp.1026.checklist.pdf
