Decoupling Memories, Muting Neurons: Towards Practical Machine Unlearning for Large Language Models

Lishuai Hou; Zixiong Wang; Gaoyang Liu; Chen Wang; Wei Liu; Kai Peng

doi:10.18653/v1/2025.findings-acl.719

Abstract

Machine Unlearning (MU) has emerged as a promising solution for removing the influence of data that an owner wishes to unlearn from Large Language Models (LLMs). However, existing MU methods, which require tuning all model parameters on the unlearned data with random labels or perturbed gradients, significantly degrade model utility, especially given the difficulty of accessing the original training data. This presents a key challenge: how can we achieve MU using only the unlearned data while preserving model utility? In this paper, we propose NeuMuter, a simple but effective MU method that eliminates the influence of unlearned data from LLMs by modulating the outputs of merely 1% of the neurons in the feed-forward network (FFN) modules within the Transformer blocks, minimizing disruption to the model’s performance. We design a trainable masking scheme that decouples the memorization of different training data within the neurons of LLMs, allowing us to precisely identify and modify neurons associated with the unlearned data. Through comprehensive evaluations on two benchmarks across four different LLMs, we demonstrate that modifying the outputs of a small fraction of the total neurons can effectively achieve MU while preserving the model’s utility across downstream tasks.
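To make the idea of muting a small fraction of FFN neurons concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a toy two-layer feed-forward block with a ReLU activation (real LLMs typically use other activations), and it assumes that a per-neuron score is already available. The names `ffn_forward`, `build_mute_mask`, and `scores` are hypothetical; in particular, the scores here are random stand-ins, whereas the paper obtains its neuron selection from a trainable masking scheme that the abstract does not detail.

```python
import random


def ffn_forward(x, w_in, w_out, mask=None):
    """Toy feed-forward block: hidden = ReLU(W_in x), y = W_out^T hidden.

    w_in[j] holds the input weights of hidden neuron j, and w_out[j] holds
    that neuron's output weights. `mask`, if given, scales the output of
    each hidden neuron (0.0 mutes it, 1.0 leaves it unchanged).
    """
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row))) for row in w_in]
    if mask is not None:
        hidden = [h * m for h, m in zip(hidden, mask)]
    d_out = len(w_out[0])
    return [sum(h * w_out[j][k] for j, h in enumerate(hidden)) for k in range(d_out)]


def build_mute_mask(scores, fraction=0.01):
    """Return a 0/1 mask that mutes the top `fraction` of neurons by score."""
    n = len(scores)
    k = max(1, round(fraction * n))
    ranked = sorted(range(n), key=lambda j: scores[j], reverse=True)
    muted = set(ranked[:k])
    return [0.0 if j in muted else 1.0 for j in range(n)]


# Assumed toy dimensions; not taken from the paper.
rng = random.Random(0)
d_model, d_hidden = 4, 200
w_in = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(d_hidden)]
w_out = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(d_hidden)]

# Hypothetical stand-in for learned per-neuron relevance to the unlearned data.
scores = [rng.random() for _ in range(d_hidden)]
mask = build_mute_mask(scores, fraction=0.01)

x = [rng.gauss(0, 1) for _ in range(d_model)]
y_full = ffn_forward(x, w_in, w_out)
y_muted = ffn_forward(x, w_in, w_out, mask)
```

In this sketch only the masked neurons' outputs change and all weights stay untouched, which mirrors the abstract's contrast with methods that tune all model parameters.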

Anthology ID: 2025.findings-acl.719
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 13978–13999
URL: https://aclanthology.org/2025.findings-acl.719/
DOI: 10.18653/v1/2025.findings-acl.719
Cite (ACL): Lishuai Hou, Zixiong Wang, Gaoyang Liu, Chen Wang, Wei Liu, and Kai Peng. 2025. Decoupling Memories, Muting Neurons: Towards Practical Machine Unlearning for Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 13978–13999, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Decoupling Memories, Muting Neurons: Towards Practical Machine Unlearning for Large Language Models (Hou et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-acl.719.pdf
