Controllable Sentence Simplification

Louis Martin, Éric de la Clergerie, Benoît Sagot, Antoine Bordes


Abstract
Text simplification aims at making a text easier to read and understand by simplifying grammar and structure while keeping the underlying information identical. It is often considered an all-purpose generic task where the same simplification is suitable for all; however, multiple audiences can benefit from simplified text in different ways. We adapt a discrete parametrization mechanism that provides explicit control over simplification systems based on Sequence-to-Sequence models. As a result, users can condition the simplifications returned by a model on attributes such as length, amount of paraphrasing, lexical complexity and syntactic complexity. We also show that carefully chosen values of these attributes allow out-of-the-box Sequence-to-Sequence models to outperform their standard counterparts on simplification benchmarks. Our model, which we call ACCESS (as shorthand for AudienCe-CEntric Sentence Simplification), establishes the state of the art at 41.87 SARI on the WikiLarge test set, a +1.42 improvement over the best previously reported score.
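The discrete parametrization described in the abstract can be pictured as prepending special tokens to the source sentence, one per controlled attribute, with each attribute value rounded to a fixed grid. The following is a minimal sketch under stated assumptions, not the authors' implementation: the token format `<NAME_value>`, the attribute name `LENGTH`, the 0.05 bin width, and all function names are hypothetical choices made for illustration. Only the length attribute is computed here, as the character-length ratio between target and source.

```python
from typing import Dict


def length_ratio(source: str, target: str) -> float:
    """Character-length ratio of target to source (assumed definition)."""
    return len(target) / len(source)


def bucket(value: float, step: float = 0.05, max_value: float = 2.0) -> float:
    """Round a ratio to the nearest multiple of `step`, clipped to [step, max_value].

    The bin width and clipping range are illustrative assumptions.
    """
    clipped = min(max(value, step), max_value)
    return round(round(clipped / step) * step, 2)


def control_token(name: str, value: float) -> str:
    """Build a hypothetical control token such as '<LENGTH_0.80>'."""
    return f"<{name}_{bucket(value):.2f}>"


def add_controls(source: str, controls: Dict[str, float]) -> str:
    """Prepend one control token per attribute to the source sentence."""
    tokens = [control_token(name, value) for name, value in controls.items()]
    return " ".join(tokens + [source])


if __name__ == "__main__":
    src = "The feline reposed upon the rug."
    tgt = "The cat sat on the rug."
    # Training time: the ratio is measured on a (source, target) pair.
    print(add_controls(src, {"LENGTH": length_ratio(src, tgt)}))
    # Inference time: the user picks the desired value directly.
    print(add_controls(src, {"LENGTH": 0.8}))
```

In a setup like this, a Sequence-to-Sequence model trained on sources annotated this way could learn to associate each token with the corresponding property of the output, so that choosing the token value at inference time steers the simplification.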
Anthology ID:
2020.lrec-1.577
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
4689–4698
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.577
Cite (ACL):
Louis Martin, Éric de la Clergerie, Benoît Sagot, and Antoine Bordes. 2020. Controllable Sentence Simplification. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 4689–4698, Marseille, France. European Language Resources Association.
Cite (Informal):
Controllable Sentence Simplification (Martin et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.577.pdf
Code:
facebookresearch/access
Data:
ASSET, Newsela, TurkCorpus, WikiLarge