An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models

Sweta Agrawal, Marine Carpuat


Abstract
We propose a framework for training non-autoregressive sequence-to-sequence models for editing tasks, where the original input sequence is iteratively edited to produce the output. We show that the imitation learning algorithms designed to train such models for machine translation introduce mismatches between training and inference that lead to undertraining and poor generalization in editing scenarios. We address this issue with two complementary strategies: 1) a roll-in policy that exposes the model to intermediate training sequences that it is more likely to encounter during inference, and 2) a curriculum that presents easy-to-learn edit operations first, gradually increasing the difficulty of training samples as the model becomes competent. We show the efficacy of these strategies on two challenging English editing tasks: controllable text simplification and abstractive summarization. Our approach significantly improves output quality on both tasks and controls output complexity better on the simplification task.
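The curriculum idea in the abstract (easy edits first, harder samples as the model becomes competent) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how difficulty is measured or how the schedule is set, so the token-level Levenshtein distance used as a difficulty proxy, the `competence` fraction, and the function names `edit_distance`, `difficulty`, and `curriculum_subset` are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def edit_distance(a: List[str], b: List[str]) -> int:
    """Token-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if tokens match)
            ))
        prev = curr
    return prev[-1]


def difficulty(src: str, tgt: str) -> float:
    """Assumed difficulty proxy: edit distance normalized by the longer side."""
    s, t = src.split(), tgt.split()
    return edit_distance(s, t) / max(len(s), len(t), 1)


def curriculum_subset(
    pairs: List[Tuple[str, str]], competence: float
) -> List[Tuple[str, str]]:
    """Return the easiest `competence` fraction (0 < competence <= 1) of pairs.

    Hypothetical schedule: the caller raises `competence` toward 1.0 as
    training progresses, so harder (source, target) pairs are admitted later.
    """
    ranked = sorted(pairs, key=lambda p: difficulty(*p))
    k = max(1, math.ceil(competence * len(ranked)))
    return ranked[:k]


pairs = [
    ("a b c", "x y z"),  # every token changes: hardest
    ("a b c", "a b c"),  # no edits needed: easiest
    ("a b c", "a x c"),  # one substitution
]
early = curriculum_subset(pairs, competence=0.1)
late = curriculum_subset(pairs, competence=1.0)
```

With a low `competence` value only the pair that needs no edits is selected; at `competence=1.0` all pairs are available, ordered from fewest to most edits.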
Anthology ID:
2022.acl-long.520
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7550–7563
URL:
https://aclanthology.org/2022.acl-long.520
DOI:
10.18653/v1/2022.acl-long.520
Cite (ACL):
Sweta Agrawal and Marine Carpuat. 2022. An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7550–7563, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
An Imitation Learning Curriculum for Text Editing with Non-Autoregressive Models (Agrawal & Carpuat, ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.520.pdf
Data
Newsela