Bin Zhu

Other people with similar names: Bin Benjamin Zhu

2023

Exploring Robust Overfitting for Pre-trained Language Models
Bin Zhu | Yanghui Rao
Findings of the Association for Computational Linguistics: ACL 2023

We identify the robust overfitting issue for pre-trained language models by showing that the robust test loss increases as training progresses over more epochs. Through a comprehensive exploration of the robust loss on the training set, we attribute robust overfitting to the model’s memorization of the adversarial training data. We attempt to mitigate robust overfitting by combining regularization methods with adversarial training. Following the principle of preventing the model from memorizing the adversarial data, we find that flooding, a regularization method with loss scaling, can mitigate robust overfitting for pre-trained language models. Finally, we investigate the effect of flooding levels and evaluate the models’ adversarial robustness under textual attacks. Extensive experiments demonstrate that our methods mitigate robust overfitting across three leading adversarial training methods and further improve adversarial robustness.
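To make the flooding idea concrete, here is a minimal sketch in plain Python. It assumes the commonly cited form of flooding, in which the training loss L is replaced by |L - b| + b for a flood level b; the abstract does not state the formula or the flood levels used, so the function names and the values below are hypothetical and this is not necessarily the authors' implementation.

```python
from typing import List


def flooded_loss(loss: float, flood_level: float) -> float:
    """Return |loss - b| + b, assuming the commonly cited flooding form.

    Above the flood level the value equals the original loss; below it,
    the value rises again as the loss keeps shrinking.
    """
    return abs(loss - flood_level) + flood_level


def flooded_grad(loss: float, grad: List[float], flood_level: float) -> List[float]:
    """Gradient of the flooded loss, given the gradient of the raw loss.

    The gradient is unchanged while loss >= flood_level (ordinary descent)
    and negated while loss < flood_level (ascent), so the training loss
    hovers around the flood level instead of going to zero. Treating the
    tie loss == flood_level as "descent" is a convention of this sketch.
    """
    sign = 1.0 if loss >= flood_level else -1.0
    return [sign * g for g in grad]


if __name__ == "__main__":
    b = 0.1  # hypothetical flood level, chosen only for illustration
    print(flooded_loss(0.5, b))                 # loss above the flood level
    print(flooded_loss(0.02, b))                # loss below the flood level
    print(flooded_grad(0.02, [1.0, -2.0], b))   # gradient direction is flipped
```

In an adversarial training loop, the same transformation would be applied to the mini-batch adversarial loss before backpropagation, which is how it can discourage the model from driving the loss on adversarial examples to zero.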

Co-authors

Yanghui Rao 1

Venues

Findings 1
