Sadat Shahriar

2026

Aligning Large Language Models (LLM) to address subjectivity and nuanced preference levels requires adequate flexibility and control, which can be a resource-intensive and time-consuming procedure. Existing training-time alignment methods require full re-training when a change is needed and inference-time ones typically require access to the reward model at each inference step. We introduce **MEAV**, an inference-time model-editing-based LLM alignment method that learns encoded representations of preference dimensions, called *Alignment Vectors* (AV). These representations enable dynamic adjusting of the model behavior during inference through simple linear operations. Here, we focus on three gradual response levels across three specialized domains: medical, legal, and financial, exemplifying its practical potential. This new alignment paradigm introduces adjustable preference knobs during inference, allowing users to tailor their LLM outputs while reducing the inference cost by half compared to the prompt engineering approach. Additionally, we find that AVs are transferable across different fine-tuning stages of the same model, demonstrating their flexibility. AVs also facilitate multidomain, diverse preference alignment, making the process 12x faster than the retraining approach.

2025

pdf bib abs

The Erosion of LLM Signatures: Can We Still Distinguish Human and LLM-Generated Scientific Ideas after Iterative Paraphrasing?
Sadat Shahriar | Navid Ayoobi | Arjun Mukherjee
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

With the increasing reliance on LLMs as research agents, distinguishing between LLM and human-generated ideas has become crucial for understanding the cognitive nuances of LLMs’ research capabilities. While detecting LLM-generated text has been extensively studied, distinguishing human vs LLM-generated *scientific ideas* remains an unexplored area. In this work, we systematically evaluate the ability of state-of-the-art (SOTA) machine learning models to differentiate between human and LLM-generated ideas, particularly after successive paraphrasing stages. Our findings highlight the challenges SOTA models face in source attribution, with detection performance declining by an average of 25.4% after five consecutive paraphrasing stages. Additionally, we demonstrate that incorporating the research problem as contextual information improves detection performance by up to 2.97%. Notably, our analysis reveals that detection algorithms struggle significantly when ideas are paraphrased into a simplified, non-expert style, contributing the most to the erosion of distinguishable LLM signatures.

pdf bib abs

Exposing Pink Slime Journalism: Linguistic Signatures and Robust Detection against LLM-Generated Threats
Sadat Shahriar | Navid Ayoobi | Arjun Mukherjee | Mostafa Musharrat | Sai Vishnu Vamsi Senagasetty
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

The local news landscape, a vital source of reliable information for 28 million Americans, faces a growing threat from Pink Slime Journalism, a low-quality, auto-generated articles that mimic legitimate local reporting. Detecting these deceptive articles requires a fine-grained analysis of their linguistic, stylistic, and lexical characteristics. In this work, we conduct a comprehensive study to uncover the distinguishing patterns of Pink Slime content and propose detection strategies based on these insights. Beyond traditional generation methods, we highlight a new adversarial vector: modifications through large language models (LLMs). Our findings reveal that even consumer-accessible LLMs can significantly undermine existing detection systems, reducing their performance by up to 40% in F1-score. To counter this threat, we introduce a robust learning framework specifically designed to resist LLM-based adversarial attacks and adapt to the evolving landscape of automated pink slime journalism, and showed and improvement by up to 27%.

2024

pdf bib abs

HU at SemEval-2024 Task 8A: Can Contrastive Learning Learn Embeddings to Detect Machine-Generated Text?
Shubhashis Roy Dipta | Sadat Shahriar
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes our system developed for SemEval-2024 Task 8, “Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection.” Machine-generated texts have been one of the main concerns due to the use of large language models (LLM) in fake text generation, phishing, cheating in exams, or even plagiarizing copyright materials. A lot of systems have been developed to detect machine-generated text. Nonetheless, the majority of these systems rely on the text-generating model. This limitation is impractical in real-world scenarios, as it’s often impossible to know which specific model the user has used for text generation. In this work, we propose a single model based on contrastive learning, which uses ~40% of the baseline’s parameters (149M vs. 355M) but shows a comparable performance on the test dataset (21st out of 137 participants). Our key finding is that even without an ensemble of multiple models, a single base model can have comparable performance with the help of data augmentation and contrastive learning.

2023

pdf bib abs

SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation
Sadat Shahriar | Thamar Solorio
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Subjectivity and difference of opinion are key social phenomena, and it is crucial to take these into account in the annotation and detection process of derogatory textual content. In this paper, we use four datasets provided by SemEval-2023 Task 11 and fine-tune a BERT model to capture the disagreement in the annotation. We find individual annotator modeling and aggregation lowers the Cross-Entropy score by an average of 0.21, compared to the direct training on the soft labels. Our findings further demonstrate that annotator metadata contributes to the average 0.029 reduction in the Cross-Entropy score.

pdf bib abs

Exploring Deceptive Domain Transfer Strategies: Mitigating the Differences among Deceptive Domains
Sadat Shahriar | Arjun Mukherjee | Omprakash Gnawali
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Deceptive text poses a significant threat to users, resulting in widespread misinformation and disorder. While researchers have created numerous cutting-edge techniques for detecting deception in domain-specific settings, whether there is a generic deception pattern so that deception-related knowledge in one domain can be transferred to the other remains mostly unexplored. Moreover, the disparities in textual expression across these many mediums pose an additional obstacle for generalization. To this end, we present a Multi-Task Learning (MTL)-based deception generalization strategy to reduce the domain-specific noise and facilitate a better understanding of deception via a generalized training. As deceptive domains, we use News (fake news), Tweets (rumors), and Reviews (fake reviews) and employ LSTM and BERT model to incorporate domain transfer techniques. Our proposed architecture for the combined approach of domain-independent and domain-specific training improves the deception detection performance by up to 5.28% in F1-score.

pdf bib abs

Tackling the Myriads of Collusion Scams on YouTube Comments of Cryptocurrency Videos
Sadat Shahriar | Arjun Mukherjee
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

Despite repeated measures, YouTube’s comment section has been a fertile ground for scammers. With the growth of the cryptocurrency market and obscurity around it, a new form of scam, namely “Collusion Scam” has emerged as a dominant force within YouTube’s comment space. Unlike typical scams and spams, collusion scams employ a cunning persuasion strategy, using the facade of genuine social interactions within comment threads to create an aura of trust and success to entrap innocent users. In this research, we collect 1,174 such collusion scam threads and perform a detailed analysis, which is tailored towards the successful detection of these scams. We find that utilization of the collusion dynamics can provide an accuracy of 96.67% and an F1-score of 93.04%. Furthermore, we demonstrate the robust predictive power of metadata associated with these threads and user channels, which act as compelling indicators of collusion scams. Finally, we show that modern LLM, like chatGPT, can effectively detect collusion scams without the need for any training.

2021

pdf bib abs

A Domain-Independent Holistic Approach to Deception Detection
Sadat Shahriar | Arjun Mukherjee | Omprakash Gnawali
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

The deception in the text can be of different forms in different domains, including fake news, rumor tweets, and spam emails. Irrespective of the domain, the main intent of the deceptive text is to deceit the reader. Although domain-specific deception detection exists, domain-independent deception detection can provide a holistic picture, which can be crucial to understand how deception occurs in the text. In this paper, we detect deception in a domain-independent setting using deep learning architectures. Our method outperforms the State-of-the-Art performance of most benchmark datasets with an overall accuracy of 93.42% and F1-Score of 93.22%. The domain-independent training allows us to capture subtler nuances of deceptive writing style. Furthermore, we analyze how much in-domain data may be helpful to accurately detect deception, especially for the cases where data may not be readily available to train. Our results and analysis indicate that there may be a universal pattern of deception lying in-between the text independent of the domain, which can create a novel area of research and open up new avenues in the field of deception detection.

Co-authors

Shubhashis Roy Dipta 1

Sai Vishnu Vamsi Senagasetty 1

Thamar Solorio 1

Monica Sunkara 1

Venues

Fix author