LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization

Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Phillippe Langlais


Abstract
Regularization techniques are crucial to improving the generalization performance and training efficiency of deep neural networks. Many deep learning algorithms rely on weight decay, dropout, and batch/layer normalization to converge faster and generalize better. Label Smoothing (LS) is another simple, versatile and efficient regularization technique that can be applied to various supervised classification tasks. Conventional LS, however, assumes that every non-target class is equally likely, regardless of the training instance. In this work, we present a general framework for training with label regularization, which includes conventional LS but can also model instance-specific variants. Based on this formulation, we propose an efficient way of learning LAbel regularization by devising a Bi-level Optimization (LABO) problem. We derive a deterministic and interpretable solution of the inner loop as the optimal label smoothing, without the need to store the parameters or the output of a trained model. Finally, we conduct extensive experiments and demonstrate that LABO consistently yields improvements over conventional label regularization in several domains, including seven machine translation and three image classification tasks across various neural network architectures, while maintaining training efficiency.
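For readers unfamiliar with the baseline, the conventional LS that the abstract contrasts against can be sketched as follows. This is a minimal illustration of uniform label smoothing only, not the LABO method; the function names `smooth_labels` and `cross_entropy`, the smoothing weight `alpha`, and the choice to mix the one-hot target with a uniform distribution over all classes are assumptions made for the example.

```python
import math


def smooth_labels(target, num_classes, alpha=0.1):
    """Mix a one-hot target with the uniform distribution over all classes.

    Assumed convention: (1 - alpha) * one_hot + alpha * uniform, so every
    non-target class receives the same mass alpha / num_classes, whatever
    the training instance is.
    """
    if not 0 <= target < num_classes:
        raise ValueError("target must be a valid class index")
    dist = [alpha / num_classes] * num_classes
    dist[target] += 1.0 - alpha
    return dist


def cross_entropy(probs, target_dist):
    """Cross-entropy between predicted probabilities and a soft target."""
    return -sum(t * math.log(p) for p, t in zip(probs, target_dist) if t > 0)


# Hypothetical 4-class example with the true class at index 2.
soft = smooth_labels(target=2, num_classes=4, alpha=0.1)
loss = cross_entropy([0.1, 0.1, 0.7, 0.1], soft)
```

Every non-target entry of `soft` is identical by construction, which is the instance-independent assumption the abstract describes; an instance-specific variant would replace the uniform term with a distribution that depends on the input.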
Anthology ID:
2023.findings-acl.356
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5759–5774
URL:
https://aclanthology.org/2023.findings-acl.356
DOI:
10.18653/v1/2023.findings-acl.356
Cite (ACL):
Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, and Phillippe Langlais. 2023. LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization. In Findings of the Association for Computational Linguistics: ACL 2023, pages 5759–5774, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization (Lu et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.356.pdf
Video:
https://aclanthology.org/2023.findings-acl.356.mp4