Hoorieh Sabzevari


2025

Unlearning is a critical capability for ensuring privacy, security, and compliance in AI systems, enabling models to forget specific data while retaining overall performance. In this work, we participated in Task 4 of SemEval 2025, which focused on unlearning across three sub-tasks: (1) long-form synthetic creative documents, (2) short-form synthetic biographies containing personally identifiable information, and (3) real documents sampled from the target model’s training dataset. We conducted four experiments, employing Supervised Fine-Tuning (SFT) and Negative Preference Optimization (NPO). Although these techniques performed well on the retain set (data that the model was supposed to remember), our findings demonstrate that they did not perform well on the forget set, where unlearning was required.

2024

This study investigates the performance of zero-shot classification using three large language models, alongside two models with large input token limits and two models pre-trained on legal data. Our main dataset comes from the domain of U.S. civil procedure. It includes summaries of legal cases, specific questions, potential answers, and detailed explanations of why each answer is relevant, all sourced from a book aimed at law students. By comparing these models, we aimed to understand how effectively they handle the complexities found in legal datasets. Our findings indicate how well large language models can handle such complex data in a zero-shot setting: the highest F1 score we achieved in these experiments was 64%.