CIAug: Equipping Interpolative Augmentation with Curriculum Learning

Ramit Sawhney, Ritesh Soun, Shrey Pandit, Megh Thakkar, Sarvagya Malaviya, Yuval Pinter


Abstract
Interpolative data augmentation has proven effective for NLP tasks. Despite its merits, mixup selects the samples it interpolates at random, which can hinder generalization and slow convergence. We propose CIAug, a novel curriculum-based learning method that builds upon mixup. It leverages the relative position of samples in hyperbolic embedding space as a complexity measure, gradually mixing up increasingly difficult and diverse samples over the course of training. CIAug achieves state-of-the-art results over existing interpolative augmentation methods on 10 benchmark datasets across 4 languages on text classification and named-entity recognition tasks. It also converges and reaches benchmark F1 scores 3 times faster. We empirically analyze the various components of CIAug and evaluate its robustness against adversarial attacks.
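The abstract describes ordering mixup by sample difficulty, where difficulty is taken from a sample's position in hyperbolic embedding space. The snippet below is a minimal illustrative sketch of that idea, not the authors' released implementation (see the Software and Code links below): it assumes hidden-state mixup, scores difficulty by Poincaré-ball distance from the origin, and uses a simple linear competence schedule. All function names and the pacing function are hypothetical.

```python
import torch

def poincare_difficulty(embeddings: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Score each sample by its distance from the origin of the Poincare ball,
    d(0, x) = arctanh(||x||); a larger distance is treated as harder.
    (Illustrative scoring; the paper's exact measure may differ.)"""
    norms = embeddings.norm(dim=-1).clamp(max=1.0 - eps)
    return torch.atanh(norms)

def curriculum_mixup(hidden: torch.Tensor, labels: torch.Tensor,
                     difficulty: torch.Tensor, step: int, total_steps: int,
                     alpha: float = 0.2):
    """Mix each sample with a partner drawn only from the currently allowed
    (easiest) fraction of the batch, so harder partners enter as training proceeds."""
    competence = min(1.0, (step + 1) / total_steps)        # linear pacing function (assumed)
    order = difficulty.argsort()                            # batch indices, easy -> hard
    pool = order[: max(1, int(competence * len(order)))]    # partners allowed at this step
    partner = pool[torch.randint(len(pool), (hidden.size(0),))]
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * hidden + (1.0 - lam) * hidden[partner]
    # Train with the usual mixup loss:
    #   lam * CE(pred, labels) + (1 - lam) * CE(pred, labels[partner])
    return mixed, labels, labels[partner], lam
```

In practice the difficulty scores could be precomputed over the whole training set rather than per batch; the per-batch version above simply keeps the sketch self-contained.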
Anthology ID:
2022.naacl-main.127
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1758–1764
URL:
https://aclanthology.org/2022.naacl-main.127
DOI:
10.18653/v1/2022.naacl-main.127
Cite (ACL):
Ramit Sawhney, Ritesh Soun, Shrey Pandit, Megh Thakkar, Sarvagya Malaviya, and Yuval Pinter. 2022. CIAug: Equipping Interpolative Augmentation with Curriculum Learning. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1758–1764, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
CIAug: Equipping Interpolative Augmentation with Curriculum Learning (Sawhney et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.127.pdf
Software:
 2022.naacl-main.127.software.zip
Video:
 https://aclanthology.org/2022.naacl-main.127.mp4
Code:
sounritesh/ciaug-naacl
Data:
CoLA, GLUE, MRPC, SST, SST-2