Samyadeep Basu
2026
A Survey on LLM-based Conversational User Simulation
Bo Ni | Yu Wang | Leyao Wang | Branislav Kveton | Franck Dernoncourt | Yu Xia | Hongjie Chen | Reuben Luera | Samyadeep Basu | Subhojyoti Mukherjee | Puneet Mathur | Nesreen K. Ahmed | Junda Wu | Li Li | Huixin Zhang | Ruiyi Zhang | Tong Yu | Sungchul Kim | Jiuxiang Gu | Zhengzhong Tu | Alexa Siu | Zichao Wang | Seunghyun Yoon | Nedim Lipka | Namyong Park | Zihao Lin | Trung Bui | Yue Zhao | Tyler Derr | Ryan A. Rossi
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
User simulation has long played a vital role in computer science due to its potential to support a wide range of applications. Language, as the primary medium of human communication, forms the foundation of social interaction and behavior. Consequently, simulating conversational behavior has become a key area of study. Recent advancements in large language models (LLMs) have significantly catalyzed progress in this domain by enabling high-fidelity generation of synthetic user conversation. In this paper, we survey recent advancements in LLM-based conversational user simulation. We introduce a novel taxonomy covering user granularity and simulation objectives. Additionally, we systematically analyze core techniques and evaluation methodologies. We aim to keep the research community informed of the latest advancements in conversational user simulation and to further facilitate future research by identifying open challenges and organizing existing work under a unified framework.
Decomposition-Enhanced Training for Post-Hoc Attributions in Language Models
Sriram Balasubramanian | Samyadeep Basu | Koustava Goswami | Ryan A. Rossi | Varun Manjunatha | Roshan Santhosh | Ruiyi Zhang | Soheil Feizi | Nedim Lipka
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are increasingly used for long-document question answering, where reliable attribution to sources is critical for trust. Existing post-hoc attribution methods work well for extractive QA but struggle in multi-hop, abstractive, and semi-extractive settings, where answers synthesize information across passages. To address these challenges, we argue that post-hoc attribution can be reframed as a reasoning problem, where answers are decomposed into constituent units, each tied to specific context. We first show that prompting models to generate such decompositions alongside attributions improves performance. Building on this, we introduce DecompTune, a post-training method that teaches models to produce answer decompositions as intermediate reasoning steps. We curate a diverse dataset of complex QA tasks, annotated with decompositions by a strong LLM, and post-train Qwen-2.5 (7B and 14B) using a two-stage SFT + GRPO pipeline with task-specific curated rewards. Across extensive experiments and ablations, DecompTune substantially improves attribution quality, outperforming prior methods and matching or exceeding state-of-the-art frontier models.
2025
A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models
Sriram Balasubramanian | Samyadeep Basu | Soheil Feizi
Findings of the Association for Computational Linguistics: EMNLP 2025
Chain-of-thought (CoT) reasoning enhances performance of large language models, but questions remain about whether these reasoning traces faithfully reflect the internal processes of the model. We present the first comprehensive study of CoT faithfulness in large vision-language models (LVLMs), investigating how both text-based and previously unexplored image-based biases affect reasoning and bias articulation. Our work introduces a novel, fine-grained evaluation pipeline for categorizing bias articulation patterns, enabling significantly more precise analysis of CoT reasoning than previous methods. This framework reveals critical distinctions in how models process and respond to different types of biases, providing new insights into LVLM CoT faithfulness. Our findings reveal that subtle image-based biases are rarely articulated compared to explicit text-based ones, even in models specialized for reasoning. Additionally, many models exhibit a previously unidentified phenomenon we term “inconsistent” reasoning (reasoning correctly before abruptly changing answers), which serves as a potential canary for detecting biased reasoning from unfaithful CoTs. We then apply the same evaluation pipeline to revisit CoT faithfulness in LLMs across various levels of implicit cues. Our findings reveal that current language-only reasoning models continue to struggle with articulating cues that are not overtly stated.
A Survey on Small Language Models
Chien Van Nguyen | Xuan Shen | Ryan Aponte | Yu Xia | Samyadeep Basu | Zhengmian Hu | Jian Chen | Mihir Parmar | Sasidhar Kunapuli | Joe Barrow | Junda Wu | Ashish Singh | Yu Wang | Jiuxiang Gu | Nesreen K. Ahmed | Nedim Lipka | Ruiyi Zhang | Xiang Chen | Tong Yu | Sungchul Kim | Hanieh Deilamsalehy | Namyong Park | Michael Rimer | Zhehao Zhang | Huanrui Yang | Puneet Mathur | Gang Wu | Franck Dernoncourt | Ryan A. Rossi | Thien Huu Nguyen
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Small Language Models (SLMs) have become increasingly important because they perform a wide range of language tasks efficiently and with minimal computational resources, making them well suited to on-device, mobile, and edge settings, among many others. In this article, we present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques. We propose a novel taxonomy for categorizing the methods used to optimize SLMs, including model compression, pruning, and quantization techniques. We summarize the datasets that are useful for benchmarking SLMs, along with the commonly used evaluation metrics. Additionally, we highlight key open challenges that remain to be addressed. Our survey aims to serve as a valuable resource for researchers and practitioners interested in developing and deploying small yet efficient language models.
2024
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP
Samyadeep Basu | Shell Xu Hu | Maziar Sanjabi | Daniela Massiceti | Soheil Feizi
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Image-text contrastive models like CLIP have wide applications in zero-shot classification, image-text retrieval, and transfer learning. However, they often struggle on compositional visio-linguistic tasks (e.g., attribute-binding or object-relationships) where their performance is no better than random chance. To address this, we introduce SDS-CLIP, a lightweight and sample-efficient distillation method to enhance CLIP’s compositional visio-linguistic reasoning. Our approach fine-tunes CLIP using a distillation objective borrowed from large text-to-image generative models like Stable-Diffusion, which are known for their strong visio-linguistic reasoning abilities. On the challenging Winoground benchmark, SDS-CLIP improves the visio-linguistic performance of various CLIP models by up to 7%, while on the ARO dataset, it boosts performance by up to 3%. This work underscores the potential of well-designed distillation objectives from generative models to enhance contrastive image-text models with improved visio-linguistic reasoning capabilities.
IntCoOp: Interpretability-Aware Vision-Language Prompt Tuning
Soumya Suvra Ghosal | Samyadeep Basu | Soheil Feizi | Dinesh Manocha
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Image-text contrastive models such as CLIP learn transferable and robust representations for zero-shot transfer to a variety of downstream tasks. However, to obtain strong downstream performance, prompts need to be carefully curated, which can be a tedious engineering task. To address the issue of manual prompt engineering, prompt-tuning is used, where a set of contextual vectors is learned by leveraging information from the training data. Despite their effectiveness, existing prompt-tuning frameworks often lack interpretability, thus limiting their ability to understand the compositional nature of images. In this work, we first identify that incorporating compositional attributes (e.g., a “green” tree frog) in the design of manual prompts can significantly enhance image-text alignment scores. Building upon this observation, we propose a novel and interpretable prompt-tuning method named IntCoOp, which learns to jointly align attribute-level inductive biases and class embeddings during prompt-tuning. To assess the effectiveness of our approach, we evaluate IntCoOp across two representative tasks in a few-shot learning setup: generalization to novel classes, and unseen domain shifts. Through extensive experiments across 10 downstream datasets on CLIP, we find that introducing attribute-level inductive biases leads to superior performance against state-of-the-art prompt-tuning frameworks. Notably, in a 16-shot setup, IntCoOp improves CoOp by 7.35% in average performance across 10 diverse datasets.
2023
On Surgical Fine-tuning for Language Encoders
Abhilasha Lodha | Gayatri Belapurkar | Saloni Chalkapurkar | Yuanming Tao | Reshmi Ghosh | Samyadeep Basu | Dmitrii Petrov | Soundararajan Srinivasan
Findings of the Association for Computational Linguistics: EMNLP 2023
Fine-tuning all the layers of a pre-trained neural language encoder (either using all the parameters or using parameter-efficient methods) is often the de facto way of adapting it to a new task. We show evidence that for different downstream language tasks, fine-tuning only a subset of layers is sufficient to obtain performance that is close to, and often better than, fine-tuning all the layers in the language encoder. We propose an efficient metric based on the diagonal of the Fisher information matrix (FIM score) to select the candidate layers for selective fine-tuning. We show, empirically on GLUE and SuperGLUE tasks and across distinct language encoders, that this metric can effectively select layers leading to strong downstream performance. Our work highlights that task-specific information corresponding to a given downstream task is often localized within a few layers, and tuning only those is sufficient for strong performance. Additionally, we demonstrate the robustness of the FIM score to rank layers in a manner that remains constant during the optimization process.
2022
Strategies to Improve Few-shot Learning for Intent Classification and Slot-Filling
Samyadeep Basu | Amr Sharaf | Karine Ip Kiun Chong | Alex Fischer | Vishal Rohra | Michael Amoake | Hazem El-Hammamy | Ehi Nosakhare | Vijay Ramani | Benjamin Han
Proceedings of the Workshop on Structured and Unstructured Knowledge Integration (SUKI)
Intent classification (IC) and slot filling (SF) are two fundamental tasks in modern Natural Language Understanding (NLU) systems. Collecting and annotating large amounts of data to train deep learning models for such systems is not scalable. This problem can be addressed by learning from few examples using fast supervised meta-learning techniques such as prototypical networks. In this work, we systematically investigate how contrastive learning and data augmentation methods can benefit these existing meta-learning pipelines for jointly modelled IC/SF tasks. Through extensive experiments across standard IC/SF benchmarks (SNIPS and ATIS), we show that our proposed approaches outperform standard meta-learning methods: contrastive losses as a regularizer in conjunction with prototypical networks consistently outperform the existing state-of-the-art for both IC and SF tasks, while data augmentation strategies primarily improve few-shot IC by a significant margin.
Co-authors
- Soheil Feizi 4
- Nedim Lipka 3
- Ryan A. Rossi 3
- Ruiyi Zhang 3
- Nesreen K. Ahmed 2
- Sriram Balasubramanian 2
- Franck Dernoncourt 2
- Jiuxiang Gu 2
- Sungchul Kim 2
- Puneet Mathur 2
- Namyong Park 2
- Junda Wu 2
- Yu Xia 2
- Tong Yu 2
- Michael Amoake 1
- Ryan Aponte 1
- Joe Barrow 1
- Gayatri Belapurkar 1
- Trung Bui 1
- Saloni Chalkapurkar 1
- Jian Chen 1
- Xiang Chen 1
- Hongjie Chen 1
- Hanieh Deilamsalehy 1
- Tyler Derr 1
- Hazem El-Hammamy 1
- Alex Fischer 1
- Soumya Suvra Ghosal 1
- Reshmi Ghosh 1
- Koustava Goswami 1
- Benjamin Han 1
- Shell Xu Hu 1
- Zhengmian Hu 1
- Karine Ip Kiun Chong 1
- Sasidhar Kunapuli 1
- Branislav Kveton 1
- Li Li 1
- Zihao Lin 1
- Abhilasha Lodha 1
- Reuben Luera 1
- Varun Manjunatha 1
- Dinesh Manocha 1
- Daniela Massiceti 1
- Subhojyoti Mukherjee 1
- Chien Van Nguyen 1
- Thien Huu Nguyen 1
- Bo Ni 1
- Ehi Nosakhare 1
- Mihir Parmar 1
- Dmitrii Petrov 1
- Vijay Ramani 1
- Michael Rimer 1
- Vishal Rohra 1
- Maziar Sanjabi 1
- Roshan Santhosh 1
- Amr Sharaf 1
- Xuan Shen 1
- Ashish Singh 1
- Alexa Siu 1
- Soundararajan Srinivasan 1
- Yuanming Tao 1
- Zhengzhong Tu 1
- Yu Wang 1
- Yu Wang 1
- Leyao Wang 1
- Zichao Wang 1
- Gang Wu 1
- Huanrui Yang 1
- Seunghyun Yoon 1
- Zhehao Zhang 1
- Huixin Zhang 1
- Yue Zhao 1