Suli Wang
2024
Shortcuts Arising from Contrast: Towards Effective and Lightweight Clean-Label Attacks in Prompt-Based Learning
Xiaopeng Xie | Ming Yan | Xiwen Zhou | Chenlong Zhao | Suli Wang | Yong Zhang | Joey Zhou
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The prompt-based learning paradigm has been shown to be vulnerable to backdoor attacks. Current clean-label attacks, which employ a specific prompt as the trigger, can succeed without external triggers while keeping poisoned samples correctly labeled. This makes them stealthier than poisoned-label attacks, but they suffer from significant false activations and are harder to mount, requiring a higher poisoning rate. Using conventional negative data augmentation methods, we found that it is challenging to balance effectiveness and stealthiness in a clean-label setting. To address this issue, we are inspired by the notion that a backdoor acts as a shortcut, and posit that this shortcut stems from the contrast between the trigger and the data used for poisoning. In this study, we propose a method named Contrastive Shortcut Injection (CSI), which leverages activation values to integrate trigger design and data selection strategies and craft stronger shortcut features. With extensive experiments on full-shot and few-shot text classification tasks, we empirically validate CSI’s high effectiveness and high stealthiness at low poisoning rates.
Co-authors
- Xiaopeng Xie 1
- Ming Yan 1
- Xiwen Zhou 1
- Chenlong Zhao 1
- Yong Zhang 1