Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem

Zeguan Xiao; Siqing Li; Yong Wang; Xuetao Wei; Jian Yang; Yun Chen; Guanhua Chen

Abstract

Machine unlearning for large language models (LLMs) aims to remove targeted knowledge while preserving general capability. In this paper, we recast LLM unlearning as an asymmetric two-task problem in which retention is the primary objective and forgetting is an auxiliary one. From this perspective, we propose a retention-prioritized gradient synthesis framework that decouples task-specific gradient extraction from conflict-aware combination. To instantiate the framework, we adapt the established PCGrad method to resolve gradient conflicts and introduce SAGO, a novel retention-prioritized gradient synthesis method. Theoretically, both variants ensure non-negative cosine similarity with the retain gradient, while SAGO achieves strictly tighter alignment through constructive sign-constrained synthesis. Empirically, on the WMDP Bio/Cyber and RWKU benchmarks, SAGO consistently pushes the Pareto frontier: e.g., on WMDP Bio (SimNPO+GD), recovery of the target model's MMLU performance improves from 44.6% (naive) to 94.0% (+PCGrad) and further to 96.0% (+SAGO), while maintaining comparable forgetting strength. Our results show that re-shaping gradient geometry, rather than re-balancing losses, is the key to mitigating unlearning-retention trade-offs.
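The non-negative cosine similarity property mentioned in the abstract can be illustrated with a minimal sketch of a one-sided, PCGrad-style projection. This is an assumption about how such a retention-prioritized adaptation might look, not the paper's implementation: the function names (`project_forget_onto_retain`, `combine`) and the toy two-dimensional gradients are hypothetical, and SAGO itself is not shown because the abstract does not specify its update rule.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project_forget_onto_retain(g_forget, g_retain):
    """Hypothetical one-sided projection: only the forget gradient is altered.

    If the forget gradient conflicts with the retain gradient (negative inner
    product), remove its component along the retain gradient. The retain
    gradient is never modified, reflecting its role as the primary objective.
    """
    inner = dot(g_forget, g_retain)
    if inner >= 0:
        return list(g_forget)  # no conflict, keep the forget gradient as-is
    scale = inner / dot(g_retain, g_retain)
    return [f - scale * r for f, r in zip(g_forget, g_retain)]

def combine(g_forget, g_retain):
    """Sum the retain gradient with the (possibly projected) forget gradient."""
    g_f = project_forget_onto_retain(g_forget, g_retain)
    return [r + f for r, f in zip(g_retain, g_f)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Toy gradients (illustrative values only): the forget gradient opposes retain.
g_retain = [1.0, 0.0]
g_forget = [-2.0, 1.0]

naive = [r + f for r, f in zip(g_retain, g_forget)]
combined = combine(g_forget, g_retain)

print(cosine(naive, g_retain) < 0)      # naive sum points against the retain gradient
print(cosine(combined, g_retain) >= 0)  # projected sum does not
```

In this sketch the combined update always has a non-negative inner product with the retain gradient: after projection the forget component is orthogonal to it, so the inner product reduces to the squared norm of the retain gradient.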

Anthology ID: 2026.acl-long.890
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 19467–19477
URL: https://aclanthology.org/2026.acl-long.890/
Cite (ACL): Zeguan Xiao, Siqing Li, Yong Wang, Xuetao Wei, Jian Yang, Yun Chen, and Guanhua Chen. 2026. Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 19467–19477, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem (Xiao et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.890.pdf
Checklist: 2026.acl-long.890.checklist.pdf
