Instance-adaptive training with noise-robust losses against noisy labels

Lifeng Jin, Linfeng Song, Kun Xu, Dong Yu


Abstract
In order to alleviate the huge demand for annotated datasets for different tasks, many recent natural language processing datasets have adopted automated pipelines for fast-tracking usable data. However, model training with such datasets poses a challenge because popular optimization objectives are not robust to label noise induced in the annotation generation process. Several noise-robust losses have been proposed and evaluated on tasks in computer vision, but they generally use a single dataset-wise hyperparameter to control the strength of noise resistance. This work proposes novel instance-adaptive training frameworks to change single dataset-wise hyperparameters of noise resistance in such losses to be instance-wise. Such instance-wise noise resistance hyperparameters are predicted by special instance-level label quality predictors, which are trained along with the main classification models. Experiments on noisy and corrupted NLP datasets show that the proposed instance-adaptive training frameworks help increase the noise-robustness provided by such losses, promoting the use of the frameworks and associated losses in NLP models trained with noisy data.
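The idea of replacing one dataset-wise noise-resistance hyperparameter with one value per instance can be illustrated with a small sketch. The abstract does not name the specific losses, so the example below uses generalized cross entropy, (1 - p^q) / q, purely as a stand-in noise-robust loss with a single hyperparameter q; this choice, the function names, and the hand-set q values (which take the place of the output of a label quality predictor) are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def gce_loss(p_true: float, q: float) -> float:
    """Generalized cross entropy (1 - p^q) / q for one instance.

    p_true is the model probability of the (possibly noisy) given label.
    q close to 0 approaches cross entropy; q = 1 gives 1 - p, which is
    less sensitive to confidently contradicted (likely noisy) labels.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("p_true must be a probability")
    return (1.0 - p_true ** q) / q


def instance_adaptive_loss(p_true: Sequence[float],
                           q_per_instance: Sequence[float]) -> float:
    """Mean loss where each instance uses its own q (hypothetical setup).

    In an instance-adaptive framework, q_per_instance would come from a
    label quality predictor trained with the classifier; here the values
    are simply passed in.
    """
    if len(p_true) != len(q_per_instance):
        raise ValueError("need exactly one q per instance")
    losses = [gce_loss(p, q) for p, q in zip(p_true, q_per_instance)]
    return sum(losses) / len(losses)


# Dataset-wise setting: every instance shares the same q.
shared = instance_adaptive_loss([0.9, 0.2], [0.7, 0.7])

# Instance-wise setting: the second label is treated as less trustworthy,
# so it gets a larger q (stronger noise resistance).
adaptive = instance_adaptive_loss([0.9, 0.2], [0.1, 1.0])
```

Under these assumptions, raising q for an instance caps how much a low-probability label can contribute to the loss, which is the sense in which a per-instance value controls the strength of noise resistance.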
Anthology ID:
2021.emnlp-main.457
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5647–5663
URL:
https://aclanthology.org/2021.emnlp-main.457
DOI:
10.18653/v1/2021.emnlp-main.457
Cite (ACL):
Lifeng Jin, Linfeng Song, Kun Xu, and Dong Yu. 2021. Instance-adaptive training with noise-robust losses against noisy labels. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5647–5663, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Instance-adaptive training with noise-robust losses against noisy labels (Jin et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.457.pdf
Video:
https://aclanthology.org/2021.emnlp-main.457.mp4
Data:
GLUE, SST, SST-2