Ophir Frieder

2025

GUIR at SemEval-2025 Task 4: Adaptive Weight Tuning with Gradual Negative Matching for LLM Unlearning
Hrishikesh Kulkarni | Nazli Goharian | Ophir Frieder
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Machine Unlearning for Large Language Models, referred to as LLM Unlearning is getting more and more attention as a result of regurgitation of sensitive and harmful content. In this paper, we present our method architecture, results, and analysis of our submission to Task4: Unlearning sensitive content from Large Language Models. This task includes three subtasks of LLM Unlearning on 1) Long Synthetic documents, 2) Short Synthetic documents, and 3) Real Training documents. Getting rid of the impact of undesirable and unauthorized responses is the core objective of unlearning. Furthermore, it is expected that unlearning should not have an adverse impact on the usability of the model. In this paper, we provide an approach for LLM unlearning that tries to make the model forget while maintaining usability of the model. We perform adaptive weight tuning with Gradient Ascent, KL minimization and Gradual Negative Matching loss functions. Our submission balances retain and forget abilities of the model while outperforming provided benchmarks.

2022

pdf bib abs

TBD3: A Thresholding-Based Dynamic Depression Detection from Social Media for Low-Resource Users
Hrishikesh Kulkarni | Sean MacAvaney | Nazli Goharian | Ophir Frieder
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Social media are heavily used by many users to share their mental health concerns and diagnoses. This trend has turned social media into a large-scale resource for researchers focused on detecting mental health conditions. Social media usage varies considerably across individuals. Thus, classification of patterns, including detecting signs of depression, must account for such variation. We address the disparity in classification effectiveness for users with little activity (e.g., new users). Our evaluation, performed on a large-scale dataset, shows considerable detection discrepancy based on user posting frequency. For instance, the F1 detection score of users with an above-median versus below-median number of posts is greater than double (0.803 vs 0.365) using a conventional CNN-based model; similar results were observed on lexical and transformer-based classifiers. To complement this evaluation, we propose a dynamic thresholding technique that adjusts the classifier’s sensitivity as a function of the number of posts a user has. This technique alone reduces the margin between users with many and few posts, on average, by 45% across all methods and increases overall performance, on average, by 33%. These findings emphasize the importance of evaluating and tuning natural language systems for potentially vulnerable populations.

2020

pdf bib abs

Offensive language detection is an important and challenging task in natural language processing. We present our submissions to the OffensEval 2020 shared task, which includes three English sub-tasks: identifying the presence of offensive language (Sub-task A), identifying the presence of target in offensive language (Sub-task B), and identifying the categories of the target (Sub-task C). Our experiments explore using a domain-tuned contextualized language model (namely, BERT) for this task. We also experiment with different components and configurations (e.g., a multi-view SVM) stacked upon BERT models for specific sub-tasks. Our submissions achieve F1 scores of 91.7% in Sub-task A, 66.5% in Sub-task B, and 63.2% in Sub-task C. We perform an ablation study which reveals that domain tuning considerably improves the classification performance. Furthermore, error analysis shows common misclassification errors made by our model and outlines research directions for future.

2016

pdf bib abs

Effects of Sampling on Twitter Trend Detection
Andrew Yates | Alek Kolcz | Nazli Goharian | Ophir Frieder
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Much research has focused on detecting trends on Twitter, including health-related trends such as mentions of Influenza-like illnesses or their symptoms. The majority of this research has been conducted using Twitter’s public feed, which includes only about 1% of all public tweets. It is unclear if, when, and how using Twitter’s 1% feed has affected the evaluation of trend detection methods. In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection. We focus on using health-related trends to estimate the prevalence of Influenza-like illnesses based on tweets. We use ground truth obtained from the CDC and Google Flu Trends to explore how the prevalence estimates degrade when moving from a 100% to a 1% sample. We find that using the 1% sample is unlikely to substantially harm ILI estimates made at the national level, but can cause poor performance when estimates are made at the city level.

2014

pdf bib abs

A Framework for Public Health Surveillance
Andrew Yates | Jon Parker | Nazli Goharian | Ophir Frieder
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

With the rapid growth of social media, there is increasing potential to augment traditional public health surveillance methods with data from social media. We describe a framework for performing public health surveillance on Twitter data. Our framework, which is publicly available, consists of three components that work together to detect health-related trends in social media: a concept extraction component for identifying health-related concepts, a concept aggregation component for identifying how the extracted health-related concepts relate to each other, and a trend detection component for determining when the aggregated health-related concepts are trending. We describe the architecture of the framework and several components that have been implemented in the framework, identify other components that could be used with the framework, and evaluate our framework on approximately 1.5 years of tweets. While it is difficult to determine how accurately a Twitter trend reflects a trend in the real world, we discuss the differences in trends detected by several different methods and compare flu trends detected by our framework to data from Google Flu Trends.

Co-authors

Venues

Fix author