LUME: LLM Unlearning with Multitask Evaluations

Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta


Abstract
Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without full retraining. In this work, we develop LUME, a multi-task unlearning benchmark featuring three tasks: (1) unlearning synthetically generated creative short novels, (2) unlearning synthetic biographies containing sensitive information, and (3) unlearning a collection of public biographies. We further release two fine-tuned LLMs, with 1B and 7B parameters, as the target models. We conduct detailed evaluations of several recently proposed unlearning algorithms and present results on carefully crafted metrics to understand their behavior and limitations.
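Benchmarks like this typically check whether a model still reproduces forgotten text verbatim. As a rough illustration of that idea (not the paper's actual metrics, whose definitions are not given here), the sketch below computes a simple unigram-overlap score between a model's completion and the original continuation; the function name and scoring scheme are illustrative assumptions.

```python
def regurgitation_score(completion: str, reference: str) -> float:
    """Fraction of reference words that also appear in the model's
    completion (a ROUGE-1-recall-like overlap). A score near 1.0
    suggests the forgotten text is still reproduced nearly verbatim;
    a score near 0.0 suggests it is no longer regurgitated."""
    ref_words = reference.lower().split()
    comp_words = set(completion.lower().split())
    if not ref_words:
        return 0.0
    hits = sum(1 for w in ref_words if w in comp_words)
    return hits / len(ref_words)

# Verbatim regurgitation scores 1.0; an unrelated completion scores 0.0.
print(regurgitation_score("the quick brown fox", "the quick brown fox"))  # 1.0
print(regurgitation_score("something else entirely", "the quick brown fox"))  # 0.0
```

In practice, an unlearning evaluation would also track utility on a retain set, since a model that forgets everything trivially achieves a low regurgitation score.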
Anthology ID:
2025.findings-emnlp.347
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6524–6535
URL:
https://aclanthology.org/2025.findings-emnlp.347/
Cite (ACL):
Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, and Rahul Gupta. 2025. LUME: LLM Unlearning with Multitask Evaluations. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 6524–6535, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
LUME: LLM Unlearning with Multitask Evaluations (Ramakrishna et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.347.pdf
Checklist:
 2025.findings-emnlp.347.checklist.pdf