Data Augmentation for Text Generation Without Any Augmented Data

Wei Bi, Huayang Li, Jiacheng Huang


Abstract
Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples. In this work, we derive an objective to formulate the problem of data augmentation on text generation tasks without any use of augmented data constructed by specific mapping functions. Our proposed objective can be efficiently optimized and applied to popular loss functions on text generation tasks with a convergence rate guarantee. Experiments on five datasets of two text generation tasks show that our approach can approximate or even surpass popular data augmentation methods.
Anthology ID:
2021.acl-long.173
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
2223–2237
URL:
https://aclanthology.org/2021.acl-long.173
DOI:
10.18653/v1/2021.acl-long.173
Cite (ACL):
Wei Bi, Huayang Li, and Jiacheng Huang. 2021. Data Augmentation for Text Generation Without Any Augmented Data. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 2223–2237, Online. Association for Computational Linguistics.
Cite (Informal):
Data Augmentation for Text Generation Without Any Augmented Data (Bi et al., ACL-IJCNLP 2021)
PDF:
https://aclanthology.org/2021.acl-long.173.pdf
Video:
https://aclanthology.org/2021.acl-long.173.mp4