Characterizing the Impacts of Instances on Robustness

Rui Zheng, Zhiheng Xi, Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, Weifeng Ge


Abstract
Building deep neural networks (DNNs) that are robust against adversarial attacks is an important but challenging task. Previous defense approaches mainly focus on developing new model structures or training algorithms, but they do little to tap the potential of training instances, especially instances with robust patterns that carry innate robustness. In this paper, we show that robust and non-robust instances in the training dataset, though both important for test performance, have contrary impacts on robustness, which makes it possible to build a highly robust model by leveraging the training dataset more effectively. We propose a new method that distinguishes robust instances from non-robust ones according to the model’s sensitivity to perturbations on individual instances during training. Surprisingly, we find that a model under standard training easily overfits the robust instances by relying on their simple patterns before it has completely learned their robust features. Finally, we propose a new mitigation algorithm to further release the potential of robust instances. Experimental results show that proper use of the robust instances in the original dataset is a new route to highly robust models.
Anthology ID:
2023.findings-acl.146
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2314–2332
URL:
https://aclanthology.org/2023.findings-acl.146
DOI:
10.18653/v1/2023.findings-acl.146
Cite (ACL):
Rui Zheng, Zhiheng Xi, Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, and Weifeng Ge. 2023. Characterizing the Impacts of Instances on Robustness. In Findings of the Association for Computational Linguistics: ACL 2023, pages 2314–2332, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Characterizing the Impacts of Instances on Robustness (Zheng et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.146.pdf
Video:
https://aclanthology.org/2023.findings-acl.146.mp4