@inproceedings{acharya-aryal-2025-howard,
title = "{H}oward {U}niversity-{AI}4{PC} at {S}em{E}val-2025 Task 4: Unlearning Sensitive Content From Large Language Models Using Finetuning and Distillation for Selective Knowledge Removal",
author = "Acharya, Aayush and
Aryal, Saurav",
editor = "Rosenthal, Sara and
Ros{\'a}, Aiala and
Ghosh, Debanjan and
Zampieri, Marcos",
booktitle = "Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.semeval-1.233/",
pages = "1772--1776",
ISBN = "979-8-89176-273-2",
abstract = "This paper presents our approach and submission to the SemEval 2025 task on ``Unlearning Sensitive Content from Large Language Models.'' The task focuses on making LLMs forget specific knowledge, such as copyrighted material and personally identifiable information (PII), without needing expensive retraining from scratch on the OLMo model. We propose a method to unlearn using fine-tuning and knowledge distillation. Our approach involves fine-tuning separate models on ``retain'' and ``forget'' datasets to preserve or suppress knowledge selectively. We then distill the model by suppressing logarithmic data from the fine-tuned model without learning using a combined loss of L2, KL divergence and cosine similarity while retaining knowledge from the fine-tuned model with retention using KL divergence loss."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="acharya-aryal-2025-howard">
<titleInfo>
<title>Howard University-AI4PC at SemEval-2025 Task 4: Unlearning Sensitive Content From Large Language Models Using Finetuning and Distillation for Selective Knowledge Removal</title>
</titleInfo>
<name type="personal">
<namePart type="given">Aayush</namePart>
<namePart type="family">Acharya</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Saurav</namePart>
<namePart type="family">Aryal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Sara</namePart>
<namePart type="family">Rosenthal</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Aiala</namePart>
<namePart type="family">Rosá</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Debanjan</namePart>
<namePart type="family">Ghosh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marcos</namePart>
<namePart type="family">Zampieri</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Vienna, Austria</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-273-2</identifier>
</relatedItem>
<abstract>This paper presents our approach and submission to the SemEval 2025 task on “Unlearning Sensitive Content from Large Language Models.” The task focuses on making LLMs forget specific knowledge, such as copyrighted material and personally identifiable information (PII), without needing expensive retraining from scratch on the OLMo model. We propose a method to unlearn using fine-tuning and knowledge distillation. Our approach involves fine-tuning separate models on “retain” and “forget” datasets to preserve or suppress knowledge selectively. We then distill the model by suppressing logarithmic data from the fine-tuned model without learning using a combined loss of L2, KL divergence and cosine similarity while retaining knowledge from the fine-tuned model with retention using KL divergence loss.</abstract>
<identifier type="citekey">acharya-aryal-2025-howard</identifier>
<location>
<url>https://aclanthology.org/2025.semeval-1.233/</url>
</location>
<part>
<date>2025-07</date>
<extent unit="page">
<start>1772</start>
<end>1776</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Howard University-AI4PC at SemEval-2025 Task 4: Unlearning Sensitive Content From Large Language Models Using Finetuning and Distillation for Selective Knowledge Removal
%A Acharya, Aayush
%A Aryal, Saurav
%Y Rosenthal, Sara
%Y Rosá, Aiala
%Y Ghosh, Debanjan
%Y Zampieri, Marcos
%S Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-273-2
%F acharya-aryal-2025-howard
%X This paper presents our approach and submission to the SemEval 2025 task on “Unlearning Sensitive Content from Large Language Models.” The task focuses on making LLMs forget specific knowledge, such as copyrighted material and personally identifiable information (PII), on the OLMo model without expensive retraining from scratch. We propose an unlearning method that combines fine-tuning and knowledge distillation. Our approach fine-tunes separate models on the “retain” and “forget” datasets to selectively preserve or suppress knowledge. We then distill a final model by suppressing the logit outputs of the forget-fine-tuned model with a combined loss of L2, KL divergence, and cosine similarity, while retaining knowledge from the retain-fine-tuned model with a KL divergence loss.
%U https://aclanthology.org/2025.semeval-1.233/
%P 1772-1776
Markdown (Informal)
[Howard University-AI4PC at SemEval-2025 Task 4: Unlearning Sensitive Content From Large Language Models Using Finetuning and Distillation for Selective Knowledge Removal](https://aclanthology.org/2025.semeval-1.233/) (Acharya & Aryal, SemEval 2025)
ACL
Aayush Acharya and Saurav Aryal. 2025. Howard University-AI4PC at SemEval-2025 Task 4: Unlearning Sensitive Content From Large Language Models Using Finetuning and Distillation for Selective Knowledge Removal. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1772–1776, Vienna, Austria. Association for Computational Linguistics.
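
The abstract describes a two-teacher distillation setup: the final model is pushed away from a teacher fine-tuned on the forget set (combined L2, KL-divergence, and cosine-similarity terms) and pulled toward a teacher fine-tuned on the retain set (a KL-divergence term). Below is a minimal PyTorch sketch of what such a combined objective could look like; the function name, the `alpha`/`beta` weights, and the sign convention used to "suppress" the forget teacher are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the combined distillation loss described in the abstract.
# Weighting factors and the negation used for "suppression" are assumptions.
import torch
import torch.nn.functional as F

def unlearning_distillation_loss(student_logits: torch.Tensor,
                                 forget_logits: torch.Tensor,
                                 retain_logits: torch.Tensor,
                                 alpha: float = 1.0,   # assumed weight on the forget terms
                                 beta: float = 1.0):   # assumed weight on the retain term
    """All tensors have shape (batch, seq_len, vocab). Returns a scalar loss."""
    student_logp = F.log_softmax(student_logits, dim=-1)

    # Forget branch: measure agreement with the forget-fine-tuned teacher ...
    forget_p = F.softmax(forget_logits, dim=-1)
    l2_term = F.mse_loss(student_logits, forget_logits)
    kl_term = F.kl_div(student_logp, forget_p, reduction="batchmean")
    cos_term = F.cosine_similarity(student_logits, forget_logits, dim=-1).mean()
    # ... and negate it so that minimizing the loss drives the student away
    # from that teacher (one possible reading of "suppressing" its logits).
    forget_loss = -(l2_term + kl_term + cos_term)

    # Retain branch: standard KL distillation toward the retain-fine-tuned teacher.
    retain_p = F.softmax(retain_logits, dim=-1)
    retain_loss = F.kl_div(student_logp, retain_p, reduction="batchmean")

    return alpha * forget_loss + beta * retain_loss
```

In this reading, the two teachers are frozen and only the student is updated; how the forget and retain batches are mixed during distillation is not specified by the abstract and would follow the paper's training procedure.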