Exploring German Multi-Level Text Simplification

Nicolas Spring, Annette Rios, Sarah Ebling


Abstract
We report on experiments in automatic text simplification (ATS) for German with multiple simplification levels along the Common European Framework of Reference for Languages (CEFR), simplifying standard German into levels A1, A2 and B1. For that purpose, we investigate the use of source labels and pretraining on standard German, allowing us to simplify standard language to a specific CEFR level. We show that these approaches are especially effective in low-resource scenarios, where we are able to outperform a standard transformer baseline. Moreover, we introduce copy labels, which we show can help the model make a distinction between sentences that require further modifications and sentences that can be copied as-is.
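The labelling idea in the abstract can be illustrated with a minimal sketch of training-data preparation. The token formats `<A1>`, `<A2>`, `<B1>` and `<copy>`, the function name `add_labels`, and the rule of marking a pair as a copy when source and target are identical are all assumptions made for illustration. They are not taken from the paper or its code repository.

```python
# Hypothetical sketch: prefix a source sentence with a target-level label
# and, when the reference simplification equals the source, a copy label.
# Label formats and the copy criterion are illustrative assumptions.

CEFR_LEVELS = ("A1", "A2", "B1")


def add_labels(source: str, target: str, level: str) -> str:
    """Return the source sentence prefixed with label tokens."""
    if level not in CEFR_LEVELS:
        raise ValueError(f"unsupported CEFR level: {level!r}")

    src = source.strip()
    labels = [f"<{level}>"]

    # Mark pairs whose reference simplification leaves the sentence unchanged.
    if src == target.strip():
        labels.append("<copy>")

    return " ".join(labels + [src])


if __name__ == "__main__":
    print(add_labels("Das ist gut.", "Das ist gut.", "A2"))
    print(add_labels("Der Rat tagt.", "Der Rat trifft sich.", "A1"))
```

The sketch covers only the training side, where the reference simplification is available. How labels would be assigned at inference time is not addressed here.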
Anthology ID:
2021.ranlp-1.150
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1339–1349
URL:
https://aclanthology.org/2021.ranlp-1.150
Cite (ACL):
Nicolas Spring, Annette Rios, and Sarah Ebling. 2021. Exploring German Multi-Level Text Simplification. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1339–1349, Held Online. INCOMA Ltd.
Cite (Informal):
Exploring German Multi-Level Text Simplification (Spring et al., RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-1.150.pdf
Code:
zurichnlp/ranlp2021-german-ats
Data:
Newsela