German4All – A Dataset and Model for Readability-Controlled Paraphrasing in German

Miriam Anschütz; Thanh Mai Pham; Eslam Nasrallah; Maximilian Müller; Cristian-George Craciun; Georg Groh

Abstract

The ability to paraphrase texts across different complexity levels is essential for creating accessible texts that can be tailored toward diverse reader groups. Thus, we introduce German4All, the first large-scale German dataset of aligned readability-controlled, paragraph-level paraphrases. It spans five readability levels and comprises over 25,000 samples. The dataset is automatically synthesized using GPT-4 and rigorously evaluated through both human and LLM-based judgments. Using German4All, we train an open-source, readability-controlled paraphrasing model that achieves state-of-the-art performance in German text simplification, enabling more nuanced and reader-specific adaptations. We open-source both the dataset and the model to encourage further research on multi-level paraphrasing.

Anthology ID: 2025.inlg-main.24
Volume: Proceedings of the 18th International Natural Language Generation Conference
Month: October
Year: 2025
Address: Hanoi, Vietnam
Editors: Lucie Flek, Shashi Narayan, Lê Hồng Phương, Jiahuan Pei
Venue: INLG
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 390–407
URL: https://aclanthology.org/2025.inlg-main.24/
Cite (ACL): Miriam Anschütz, Thanh Mai Pham, Eslam Nasrallah, Maximilian Müller, Cristian-George Craciun, and Georg Groh. 2025. German4All – A Dataset and Model for Readability-Controlled Paraphrasing in German. In Proceedings of the 18th International Natural Language Generation Conference, pages 390–407, Hanoi, Vietnam. Association for Computational Linguistics.
Cite (Informal): German4All – A Dataset and Model for Readability-Controlled Paraphrasing in German (Anschütz et al., INLG 2025)
PDF: https://aclanthology.org/2025.inlg-main.24.pdf