ARCH: Efficient Adversarial Regularized Training with Caching

Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao


Abstract
Adversarial regularization can improve model generalization in many natural language processing tasks. However, conventional approaches are computationally expensive since they need to generate a perturbation for each sample in each epoch. We propose a new adversarial regularization method ARCH (adversarial regularization with caching), where perturbations are generated and cached once every several epochs. Because caching all the perturbations raises memory concerns, we adopt a K-nearest-neighbors-based strategy that requires caching only a small number of perturbations, without introducing additional training time. We evaluate our proposed method on a set of neural machine translation and natural language understanding tasks. We observe that ARCH significantly eases the computational burden, saving up to 70% of the computation time compared with conventional approaches. More surprisingly, by reducing the variance of the stochastic gradients, ARCH yields notably better (on most tasks) or comparable model generalization. Our code is publicly available.
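The sketch below illustrates the caching scheme the abstract describes: perturbations are regenerated only once every several epochs for a small subset of samples, and every training sample reuses the cached perturbation of its nearest neighbor in embedding space. This is a minimal sketch, not the authors' implementation (see SimiaoZuo/Caching-Adv for that); it assumes a classifier head over fixed-size embeddings, a single gradient-ascent step in place of whatever inner maximization the paper uses, and K = 1 in the nearest-neighbor lookup. All names and hyperparameter values are illustrative.

```python
import torch
import torch.nn.functional as F

# Illustrative hyperparameters -- not taken from the paper's code.
EPS = 1e-3              # perturbation norm bound
REFRESH_EVERY = 3       # regenerate the cache once every few epochs


def make_perturbation(classify, embeds, labels, eps=EPS):
    """One gradient-ascent step on the embeddings (FGSM-style stand-in)."""
    embeds = embeds.detach().requires_grad_(True)
    loss = F.cross_entropy(classify(embeds), labels)
    grad, = torch.autograd.grad(loss, embeds)
    return (eps * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)).detach()


def lookup(embeds, keys, vals):
    """Assign each sample the cached perturbation of its nearest cached key."""
    idx = torch.cdist(embeds, keys).argmin(dim=1)   # (B,) nearest-neighbor ids
    return vals[idx]                                 # (B, d)


def train(classify, opt, loader, cache_loader, epochs, lam=1.0):
    """cache_loader yields the small subset whose perturbations are cached."""
    keys = vals = None
    for epoch in range(epochs):
        if epoch % REFRESH_EVERY == 0:
            # Refresh the cache: generate perturbations for the subset only.
            ks, vs = [], []
            for embeds, labels in cache_loader:
                ks.append(embeds.detach())
                vs.append(make_perturbation(classify, embeds, labels))
            keys, vals = torch.cat(ks), torch.cat(vs)
        for embeds, labels in loader:
            delta = lookup(embeds, keys, vals)       # reuse cached perturbation
            clean = F.cross_entropy(classify(embeds), labels)
            # Adversarial regularizer: penalize loss under the perturbation
            # (plain cross-entropy here; a KL consistency term is also common).
            adv = F.cross_entropy(classify(embeds + delta), labels)
            loss = clean + lam * adv
            opt.zero_grad()
            loss.backward()
            opt.step()
```

The key cost saving is that `make_perturbation`, which needs an extra forward-backward pass, runs only on a small subset once every `REFRESH_EVERY` epochs instead of on every sample in every epoch; the nearest-neighbor lookup amortizes those few cached perturbations over the whole dataset.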
Anthology ID:
2021.findings-emnlp.348
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4118–4131
URL:
https://aclanthology.org/2021.findings-emnlp.348
DOI:
10.18653/v1/2021.findings-emnlp.348
Cite (ACL):
Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, and Tuo Zhao. 2021. ARCH: Efficient Adversarial Regularized Training with Caching. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4118–4131, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
ARCH: Efficient Adversarial Regularized Training with Caching (Zuo et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.348.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.348.mp4
Code:
SimiaoZuo/Caching-Adv
Data:
ANLI, CoLA, GLUE, MRPC, SST, SST-2