Contextualizing Variation in Text Style Transfer Datasets

Stephanie Schoch, Wanyu Du, Yangfeng Ji


Abstract
Text style transfer involves rewriting the content of a source sentence in a target style. Despite there being a number of style tasks with available data, there has been limited systematic discussion of how text style datasets relate to each other. This understanding, however, is likely to have implications for selecting multiple data sources for model training. While it is prudent to consider inherent stylistic properties when determining these relationships, we also must consider how a style is realized in a particular dataset. In this paper, we conduct several empirical analyses of existing text style datasets. Based on our results, we propose a categorization of stylistic and dataset properties to consider when utilizing or comparing text style datasets.
Anthology ID:
2021.inlg-1.22
Volume:
Proceedings of the 14th International Conference on Natural Language Generation
Month:
August
Year:
2021
Address:
Aberdeen, Scotland, UK
Editors:
Anya Belz, Angela Fan, Ehud Reiter, Yaji Sripada
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
226–239
URL:
https://aclanthology.org/2021.inlg-1.22
DOI:
10.18653/v1/2021.inlg-1.22
Cite (ACL):
Stephanie Schoch, Wanyu Du, and Yangfeng Ji. 2021. Contextualizing Variation in Text Style Transfer Datasets. In Proceedings of the 14th International Conference on Natural Language Generation, pages 226–239, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Cite (Informal):
Contextualizing Variation in Text Style Transfer Datasets (Schoch et al., INLG 2021)
PDF:
https://aclanthology.org/2021.inlg-1.22.pdf
Data
GYAFC, Penn Treebank