Posterior Differential Regularization with f-divergence for Improving Model Robustness

Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, Jianfeng Gao


Abstract
We address the problem of enhancing model robustness through regularization. Specifically, we focus on methods that regularize the model posterior difference between clean and noisy inputs. Theoretically, we provide a connection of two recent methods, Jacobian Regularization and Virtual Adversarial Training, under this framework. Additionally, we generalize the posterior differential regularization to the family of f-divergences and characterize the overall framework in terms of the Jacobian matrix. Empirically, we compare those regularizations and standard BERT training on a diverse set of tasks to provide a comprehensive profile of their effect on model generalization. For both fully supervised and semi-supervised settings, we show that regularizing the posterior difference with f-divergence can result in well-improved model robustness. In particular, with a proper f-divergence, a BERT-base model can achieve comparable generalization as its BERT-large counterpart for in-domain, adversarial and domain shift scenarios, indicating the great potential of the proposed framework for enhancing NLP model robustness.
Anthology ID:
2021.naacl-main.85
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1078–1089
Language:
URL:
https://aclanthology.org/2021.naacl-main.85
DOI:
10.18653/v1/2021.naacl-main.85
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.naacl-main.85.pdf