Souradip Chakraborty
2026
Jailbreaks as Inference-Time Alignment: A Framework for Understanding Safety Failures in LLMs
James Beetham | Souradip Chakraborty | Mengdi Wang | Furong Huang | Amrit Singh Bedi | Mubarak Shah
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are safety-aligned to prevent harmful response generation, yet remain vulnerable to jailbreak attacks. While prior works have focused on improving jailbreak attack effectiveness, they offer little explanation for why safety alignment fails. We address this gap by framing jailbreaks as inference-time alignment, connecting attack design and safety alignment within a unified optimization framework. This framing allows us to extend best-of-N inference-time alignment to the adversarial setting, yielding a method we call LIAR (Leveraging Inference-time Alignment to jailbReak), and to derive suboptimality bounds showing that LIAR provably approaches an optimal jailbreak as compute scales. Our framework also lets us develop the notion of a Safety-Net, a measure of how vulnerable an LLM is to jailbreaks, which helps to explain why safety alignment can fail. Empirically, LIAR produces natural, hard-to-detect prompts that achieve a competitive attack success rate while running 10 to 100x faster than prior suffix-based jailbreaks.
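For readers unfamiliar with the best-of-N procedure the abstract builds on, the following is a minimal, generic sketch: draw N candidates from a sampler, score each with a reward function, and keep the highest-scoring one. It is not the authors' LIAR implementation. The names `best_of_n`, `toy_sample`, and `toy_reward` are hypothetical, and the sampler and reward are toy stand-ins chosen only so the example runs on its own.

```python
import random
from typing import Callable, Tuple


def best_of_n(
    sample: Callable[[random.Random], str],
    reward: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Draw n candidates and return the one with the highest reward.

    `sample` and `reward` are placeholders (assumptions): in practice they
    would wrap a generator model and a scoring model, respectively.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    candidates = [sample(rng) for _ in range(n)]
    scored = [(reward(c), c) for c in candidates]
    # max() keeps the first candidate among ties.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins so the sketch is runnable without any model.
WORDS = ["a", "bb", "ccc", "dddd"]


def toy_sample(rng: random.Random) -> str:
    return rng.choice(WORDS)


def toy_reward(text: str) -> float:
    return float(len(text))


if __name__ == "__main__":
    for n in (1, 4, 32):
        print(n, best_of_n(toy_sample, toy_reward, n))
```

With a fixed seed, the candidates drawn for a smaller N are a prefix of those drawn for a larger N, so the best score never decreases as N grows. This is the intuition behind spending more inference-time compute to approach the optimum.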
2025
Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
Aakriti Agrawal | Rohith Aralikatti | Anirudh Satheesh | Souradip Chakraborty | Amrit Singh Bedi | Furong Huang
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) have demonstrated exceptional capabilities, yet selecting the most reliable response from multiple LLMs remains a challenge, particularly in resource-constrained settings. Existing approaches often depend on costly external verifiers, human evaluators, or self-consistency techniques that require multiple samples from a single model. While multi-LLM systems produce more diverse responses than single models and thus have greater potential, they often underperform single-LLM self-consistency. In this work, we propose a calibrated log-likelihood-based selection framework to improve multi-LLM performance. Our approach leverages uncertainty estimation to identify the most confident response while minimizing inference costs. We show that our method outperforms majority voting and exceeds self-consistency performance when using a large number of model calls. Through extensive experiments, we demonstrate improvements of approximately 4%, 3%, and 5% on GSM8K, MMLU, and ARC, respectively, when applying uncertainty-aware selection to multi-LLM systems.
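The idea of selecting the most confident response by log-likelihood can be sketched as follows. This is an illustration under stated assumptions, not the paper's method. The abstract does not specify the calibration scheme, so a hypothetical per-model additive offset stands in for it, and the `Response` type and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Response:
    model: str                   # name of the model that produced the answer
    answer: str                  # the candidate answer text
    token_logprobs: List[float]  # per-token log-probabilities of the answer


def mean_logprob(response: Response) -> float:
    """Length-normalized log-likelihood, so long answers are not penalized."""
    if not response.token_logprobs:
        raise ValueError("response has no token log-probabilities")
    return sum(response.token_logprobs) / len(response.token_logprobs)


def select_response(
    responses: List[Response],
    offsets: Optional[Dict[str, float]] = None,
) -> Response:
    """Pick the response with the highest calibrated confidence score.

    `offsets` is a hypothetical calibration (an assumption): a per-model
    value subtracted from the score to discount overconfident models.
    """
    if not responses:
        raise ValueError("no responses to select from")
    offsets = offsets or {}
    return max(
        responses,
        key=lambda r: mean_logprob(r) - offsets.get(r.model, 0.0),
    )


if __name__ == "__main__":
    candidates = [
        Response("model_a", "42", [-0.1, -0.2]),
        Response("model_b", "41", [-0.5, -0.3]),
    ]
    print(select_response(candidates).answer)
    print(select_response(candidates, {"model_a": 0.5}).answer)
```

Each model is called once here, which is the cost advantage over self-consistency sampling. The selection quality then depends on how well the scores are calibrated across models.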
2020
BioMedBERT: A Pre-trained Biomedical Language Model for QA and IR
Souradip Chakraborty | Ekaba Bisong | Shweta Bhatt | Thomas Wagner | Riley Elliott | Francesco Mosconi
Proceedings of the 28th International Conference on Computational Linguistics
The SARS-CoV-2 (COVID-19) pandemic spotlighted the importance of moving quickly with biomedical research. However, as the number of biomedical research papers continues to increase, the task of finding relevant articles to answer pressing questions has become a significant challenge. In this work, we propose a textual data mining tool that supports literature search to accelerate the work of researchers in the biomedical domain. We achieve this by building a neural-based deep contextual understanding model for Question-Answering (QA) and Information Retrieval (IR) tasks. We pre-train our BioMedBERT model on the new BREATHE dataset, one of the largest available datasets of biomedical research literature, which contains abstracts and full-text articles from ten different biomedical literature sources. Our work achieves state-of-the-art results on the QA fine-tuning task on the BioASQ 5b, 6b and 7b datasets. In addition, we observe more relevant results when BioMedBERT embeddings are used with Elasticsearch for the Information Retrieval task on the intelligently formulated BioASQ dataset. We believe our diverse dataset and our unique model architecture are what led us to achieve the state-of-the-art results for QA and IR tasks.
Transformers at SemEval-2020 Task 11: Propaganda Fragment Detection Using Diversified BERT Architectures Based Ensemble Learning
Ekansh Verma | Vinodh Motupalli | Souradip Chakraborty
Proceedings of the Fourteenth Workshop on Semantic Evaluation
In this paper, we present our approach for the 'Detection of Propaganda Techniques in News Articles' task as a part of the 2020 edition of the International Workshop on Semantic Evaluation. The specific objective of this task is to identify and extract the text segments in which propaganda techniques are used. We propose a multi-system deep learning framework that can be used to identify the presence of propaganda fragments in a news article, and we examine in depth the diverse enhancements of the BERT architecture that are part of the final solution. Our proposed final model achieved an F1-score of 0.48 on the test dataset.