Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification

Renliang Sun, Wei Xu, Xiaojun Wan


Abstract
Randomly masking text spans in ordinary texts during pre-training does little to teach models to generate simple texts, and this can hurt the performance of pre-trained models on text simplification tasks. In this paper, we propose a new continued pre-training strategy that teaches the pre-trained model to generate simple texts. We continue pre-training BART, a representative model, to obtain SimpleBART. SimpleBART consistently and significantly improves over BART on lexical simplification, sentence simplification, and document-level simplification. Finally, we compare SimpleBART with several representative large language models (LLMs).
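To make the continued pre-training idea concrete, here is a minimal sketch that applies a BART-style span-infilling (denoising) objective to a small simple-text corpus using HuggingFace Transformers. The sentences, masking ratio, and hyperparameters below are placeholders for illustration only; the actual data selection and masking strategy used to train SimpleBART are described in the paper.

# Minimal sketch: continued pre-training of BART with a denoising objective
# on simple texts. Illustrative only; not the authors' exact recipe.
import random
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def mask_span(text, mask_ratio=0.3):
    """Replace one random contiguous span of words with the <mask> token
    (a simplified stand-in for BART's text-infilling noise)."""
    words = text.split()
    span_len = max(1, int(len(words) * mask_ratio))
    start = random.randrange(0, max(1, len(words) - span_len))
    corrupted = words[:start] + [tokenizer.mask_token] + words[start + span_len:]
    return " ".join(corrupted)

# Hypothetical simple sentences standing in for a simplified-text corpus.
simple_sentences = [
    "The dog ran to the park.",
    "She read a short book about stars.",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for sentence in simple_sentences:
    inputs = tokenizer(mask_span(sentence), return_tensors="pt")
    labels = tokenizer(sentence, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss  # reconstruct the simple text
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

The intuition is that reconstructing masked spans of simple texts, rather than of ordinary texts, nudges the model toward generating simpler output when later fine-tuned on simplification tasks.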
Anthology ID: 2023.findings-acl.595
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 9345–9355
URL: https://aclanthology.org/2023.findings-acl.595
DOI: 10.18653/v1/2023.findings-acl.595
Cite (ACL): Renliang Sun, Wei Xu, and Xiaojun Wan. 2023. Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification. In Findings of the Association for Computational Linguistics: ACL 2023, pages 9345–9355, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification (Sun et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.595.pdf