Ruixiang Tang


pdf bib
Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning
Ruixiang Tang | Dehan Kong | Longtao Huang | Hui Xue
Findings of the Association for Computational Linguistics: ACL 2023

Large language models (LLMs) have recently shown great potential for in-context learning, where LLMs learn a new task simply by conditioning on a few input-label pairs (prompts). Despite their potential, our understanding of the factors influencing end-task performance and the robustness of in-context learning remains limited. This paper aims to bridge this knowledge gap by investigating the reliance of LLMs on shortcuts or spurious correlations within prompts. Through comprehensive experiments on classification and extraction tasks, we reveal that LLMs are “lazy learners” that tend to exploit such shortcuts. Additionally, we uncover a surprising finding that larger models are more likely to utilize shortcuts in prompts during inference. Our findings provide a new perspective on evaluating robustness in in-context learning and pose new challenges for detecting and mitigating the use of shortcuts in prompts.

pdf bib
Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks
Ruixiang Tang | Gord Lueck | Rodolfo Quispe | Huseyin Inan | Janardhan Kulkarni | Xia Hu
Findings of the Association for Computational Linguistics: EMNLP 2023

Large language models have revolutionized the field of NLP by achieving state-of-the-art performance on various tasks. However, there is a concern that these models may disclose information in the training data. In this study, we focus on the summarization task and investigate the membership inference (MI) attack: given a sample and black-box access to a model’s API, it is possible to determine if the sample was part of the training data. We exploit text similarity and the model’s resistance to document modifications as potential MI signals and evaluate their effectiveness on widely used datasets. Our results demonstrate that summarization models are at risk of exposing data membership, even in cases where the reference summary is not available. Furthermore, we discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.