Ada Canoğlu

2026

TUNE: A Task For Turkish Machine Unlearning For Data Privacy
Doruk Benli | Ada Canoğlu | Nehir İlkim Gönençer | Dilara Keküllüoğlu
Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)

Most large language models (LLMs) are trainedon massive datasets that include private infor-mation, which may be disclosed to third-partyusers in output generation. Developers put de-fences to prevent the generation of harmful andprivate information, but jailbreaking methodscan be used to bypass them. Machine unlearn-ing aims to remove information that may beprivate or harmful from the model’s genera-tion without retraining the model from scratch.While machine unlearning has gained somepopularity to counter the removal of privateinformation, especially in English, little to noattention has been given to Turkish unlearn-ing paradigms or existing benchmarks. In thisstudy, we introduce TUNE (Turkish Unlearn-ing Evaluation), the first benchmark datasetfor Turkish unlearning task for personal infor-mation. TUNE consists of 9842 input-targettext pairs about 50 fictitious personalities withtwo training task types: (1) Q A and (2) In-formation Request. We fine-tuned the mT5base model to evaluate various unlearning meth-ods, including our proposed approach. We findthat while current methods can help unlearnunwanted private information in Turkish, theyalso unlearn other information we want to re-tain in the model.

Co-authors

Venues

SIGTURK1
WS1

Fix author