Parallel Data Augmentation for Formality Style Transfer

Yi Zhang, Tao Ge, Xu Sun


Abstract
The main barrier to progress in formality style transfer is the scarcity of parallel training data. In this paper, we study how to augment parallel data and propose simple yet novel data augmentation methods that obtain useful sentence pairs from easily accessible models and systems. Experiments demonstrate that the augmented parallel data substantially improves formality style transfer when used to pre-train the model, leading to state-of-the-art results on the GYAFC benchmark dataset.
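The abstract mentions obtaining sentence pairs with "easily accessible models and systems" but does not spell out the mechanics here. One common instantiation of that idea is round-trip machine translation: a sentence is translated into a pivot language and back, and the resulting paraphrase is paired with the original to form an augmented training example. The sketch below illustrates that pattern only; the Helsinki-NLP MarianMT checkpoints and the round_trip helper are assumptions for demonstration, not necessarily the models or pipeline the authors used.

# Illustrative sketch: round-trip MT as one source of augmented sentence pairs.
# The checkpoints and helper names here are hypothetical choices for the demo,
# not the paper's confirmed setup.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    # Load a MarianMT tokenizer/model pair from the Hugging Face hub.
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_fr_tok, en_fr = load("Helsinki-NLP/opus-mt-en-fr")  # English -> French
fr_en_tok, fr_en = load("Helsinki-NLP/opus-mt-fr-en")  # French -> English

def translate(sentences, tok, model):
    # Translate a batch of sentences and decode the generated token ids.
    batch = tok(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch, max_length=128)
    return tok.batch_decode(generated, skip_special_tokens=True)

def round_trip(sentences):
    # English -> French -> English; pair the back-translated paraphrase
    # with the original sentence as an augmented (source, target) example.
    pivot = translate(sentences, en_fr_tok, en_fr)
    back = translate(pivot, fr_en_tok, fr_en)
    return list(zip(back, sentences))

pairs = round_trip(["I would be grateful if you could send the report."])
print(pairs)

In practice such pairs would still need filtering for quality and for an actual formality difference before being used to pre-train a style transfer model; this sketch only shows how accessible MT systems can yield parallel data cheaply.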
Anthology ID:
2020.acl-main.294
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3221–3228
URL:
https://aclanthology.org/2020.acl-main.294
DOI:
10.18653/v1/2020.acl-main.294
Cite (ACL):
Yi Zhang, Tao Ge, and Xu Sun. 2020. Parallel Data Augmentation for Formality Style Transfer. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3221–3228, Online. Association for Computational Linguistics.
Cite (Informal):
Parallel Data Augmentation for Formality Style Transfer (Zhang et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.294.pdf
Video:
http://slideslive.com/38929365
Code:
lancopku/Augmented_Data_for_FST
Data:
GYAFC