Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models

Yixin Wan; Anil Ramakrishna; Kai-Wei Chang; Volkan Cevher; Rahul Gupta

doi:10.18653/v1/2025.findings-emnlp.96

Abstract

Large Language Model (LLM) unlearning has recently gained significant attention, driven by the need to remove unwanted information (such as private, sensitive, or copyrighted content) from trained models. However, conventional unlearning approaches indiscriminately update model parameters to forget all tokens in a target document, including common tokens (e.g., pronouns, prepositions, general nouns) that carry general knowledge. In this paper, we highlight that “not every token needs forgetting”. We propose **Selective Unlearning (SU)**, which identifies a critical subset of tokens within the forgetting set that is relevant to the unwanted information, and unlearns only those tokens. Experiments on two benchmarks and six baseline unlearning algorithms demonstrate that SU not only achieves effective unlearning on the targeted forget data, but also significantly preserves the model’s utility on the retaining set.
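
The idea of unlearning only a subset of tokens can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the critical tokens are identified, so the boolean mask here is simply assumed to be given, and a gradient-ascent style objective (minimizing the log-probability of the forgotten tokens) is assumed as the base algorithm. The function name `selective_forget_loss` and the example values are hypothetical.

```python
from typing import Sequence


def selective_forget_loss(
    log_probs: Sequence[float], forget_mask: Sequence[bool]
) -> float:
    """Mean log-probability over the tokens selected for forgetting.

    Minimizing this value pushes down the likelihood of the selected
    tokens only (a gradient-ascent style objective, assumed here).
    Tokens whose mask entry is False contribute nothing to the loss.
    """
    if len(log_probs) != len(forget_mask):
        raise ValueError("log_probs and forget_mask must have equal length")
    selected = [lp for lp, keep in zip(log_probs, forget_mask) if keep]
    if not selected:
        # No token was selected, so there is nothing to unlearn.
        return 0.0
    return sum(selected) / len(selected)


if __name__ == "__main__":
    # Hypothetical per-token log-probabilities for "Alice lives in Paris".
    log_probs = [-0.5, -0.25, -0.25, -1.0]

    # Conventional unlearning: every token is part of the objective.
    all_tokens = [True, True, True, True]
    # Selective variant: only the tokens assumed to carry the unwanted
    # information ("Alice", "Paris") are part of the objective.
    critical_only = [True, False, False, True]

    print(selective_forget_loss(log_probs, all_tokens))     # -0.5
    print(selective_forget_loss(log_probs, critical_only))  # -0.75
```

Under the selective mask, the common tokens ("lives", "in") receive no update signal from the forget objective, which is the intuition behind preserving general knowledge while forgetting the targeted content.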

Anthology ID: 2025.findings-emnlp.96
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1827–1835
URL: https://aclanthology.org/2025.findings-emnlp.96/
DOI: 10.18653/v1/2025.findings-emnlp.96
Cite (ACL): Yixin Wan, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, and Rahul Gupta. 2025. Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 1827–1835, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Not Every Token Needs Forgetting: Selective Unlearning Balancing Forgetting and Utility in Large Language Models (Wan et al., Findings 2025)
PDF: https://aclanthology.org/2025.findings-emnlp.96.pdf
Checklist: 2025.findings-emnlp.96.checklist.pdf
