DMix: Distance Constrained Interpolative Mixup

Ramit Sawhney, Megh Thakkar, Shrey Pandit, Debdoot Mukherjee, Lucie Flek


Abstract
Interpolation-based regularisation methods have proven effective across a range of tasks and modalities. Mixup is a data augmentation method that generates virtual training samples from convex combinations of individual inputs and labels. We extend Mixup and propose DMix, a distance-constrained interpolative Mixup for sentence classification that leverages hyperbolic space. DMix outperforms existing data augmentation methods on sentence classification, achieving state-of-the-art results across datasets in four languages.
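To make the "convex combinations of individual inputs and labels" concrete, the following is a minimal sketch of vanilla Mixup only. It does not implement DMix: the abstract does not specify how the distance constraint or the hyperbolic geometry is applied, so neither is attempted here. The function name mixup, the default alpha = 0.4, and the use of plain lists for feature vectors and one-hot labels are assumptions made for illustration.

```python
import random


def mixup(x_i, y_i, x_j, y_j, alpha=0.4, rng=random):
    """Vanilla Mixup (illustrative sketch, not the DMix method).

    x_i, x_j: feature vectors of equal length (lists of floats).
    y_i, y_j: one-hot label vectors of equal length.
    alpha:    Beta distribution parameter (assumed value, not from the paper).
    rng:      the random module or a random.Random instance.
    """
    # Sample the mixing ratio lambda from Beta(alpha, alpha); it lies in [0, 1].
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the two inputs and of the two labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


# Usage: mix two toy samples that belong to different classes.
rng = random.Random(0)
x_mix, y_mix, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
```

Because the same ratio is applied to inputs and labels, the mixed label remains a valid distribution over classes (its entries sum to 1) whenever both original labels are.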
Anthology ID:
2021.mrl-1.21
Original:
2021.mrl-1.21v1
Version 2:
2021.mrl-1.21v2
Volume:
Proceedings of the 1st Workshop on Multilingual Representation Learning
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Duygu Ataman, Alexandra Birch, Alexis Conneau, Orhan Firat, Sebastian Ruder, Gozde Gul Sahin
Venue:
MRL
Publisher:
Association for Computational Linguistics
Pages:
242–244
URL:
https://aclanthology.org/2021.mrl-1.21
DOI:
10.18653/v1/2021.mrl-1.21
PDF:
https://aclanthology.org/2021.mrl-1.21.pdf
Video:
https://aclanthology.org/2021.mrl-1.21.mp4