Noisy Label Regularisation for Textual Regression

Yuxia Wang, Timothy Baldwin, Karin Verspoor


Abstract
Training with noisy labelled data is known to be detrimental to model performance, especially for high-capacity neural network models in low-resource domains. Our experiments suggest that standard regularisation strategies, such as weight decay and dropout, are ineffective in the face of noisy labels. We propose a simple noisy label detection method that prevents error propagation from the input layer. The approach is based on the observation that the projection of noisy labels is learned through memorisation at advanced stages of learning, and that the Pearson correlation is sensitive to outliers. Extensive experiments over real-world human-disagreement annotations as well as randomly-corrupted and data-augmented labels, across various tasks and domains, demonstrate that our method is effective, regularising noisy labels and improving generalisation performance.
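The abstract's key observation — that the Pearson correlation is sensitive to outliers, so corrupted labels drag it down sharply — can be illustrated with a small sketch. This is a hypothetical illustration, not the authors' published algorithm: the `flag_noisy` function, the greedy leave-one-out procedure, and the `threshold` parameter are all assumptions made for the example.

```python
import numpy as np

def pearson_r(x, y):
    # Pearson correlation coefficient between two 1-D sequences.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)))

def flag_noisy(preds, labels, threshold=0.05):
    """Greedily flag examples whose removal most improves the Pearson
    correlation between predictions and labels, while the improvement
    exceeds `threshold`. Returns the flagged indices.

    Hypothetical sketch: a single large-error label depresses r for the
    whole set, so dropping it yields a big jump in correlation."""
    idx = list(range(len(labels)))
    flagged = []
    while len(idx) > 2:
        base = pearson_r([preds[i] for i in idx], [labels[i] for i in idx])
        # Leave-one-out: how much does r improve if example j is dropped?
        gains = []
        for j in idx:
            rest = [i for i in idx if i != j]
            r = pearson_r([preds[i] for i in rest], [labels[i] for i in rest])
            gains.append((r - base, j))
        gain, j = max(gains)
        if gain < threshold:
            break  # no single example is distorting r much any more
        flagged.append(j)
        idx.remove(j)
    return flagged
```

For instance, if predictions track twenty evenly spaced clean labels and a single label is corrupted to an extreme value, that one example dominates the leave-one-out gain and is flagged first, after which no further removal clears the threshold.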
Anthology ID:
2022.coling-1.371
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4228–4240
URL:
https://aclanthology.org/2022.coling-1.371
Cite (ACL):
Yuxia Wang, Timothy Baldwin, and Karin Verspoor. 2022. Noisy Label Regularisation for Textual Regression. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4228–4240, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Noisy Label Regularisation for Textual Regression (Wang et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.371.pdf
Code
 yuxiaw/regularise-regression-noisy-labels
Data
IMDb Movie Reviews, PeerRead