Assessing sentence readability for German language learners with broad linguistic modeling or readability formulas: When do linguistic insights make a difference?

Zarah Weiss, Detmar Meurers


Abstract
We present a new state-of-the-art sentence-wise readability assessment model for German L2 readers. We build a linguistically broadly informed machine learning model and compare its performance against four commonly used readability formulas. To understand when the linguistic insights used to inform our model make a difference for readability assessment and when simple readability formulas suffice, we compare their performance based on two common automatic readability assessment tasks: predictive regression and sentence pair ranking. We find that leveraging linguistic insights yields top performances across tasks, but that for the identification of simplified sentences also readability formulas – which are easier to compute and more accessible – can be sufficiently precise. Linguistically informed modeling, however, is the only viable option for high quality outcomes in fine-grained prediction tasks. We then explore the sentence-wise readability profile of leveled texts written for language learners at a beginning, intermediate, and advanced level of German to showcase the valuable insights that sentence-wise readability assessment can have for the adaptation of learning materials and better understand how sentences’ individual readability contributes to larger texts’ overall readability.
Anthology ID:
2022.bea-1.19
Volume:
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)
Month:
July
Year:
2022
Address:
Seattle, Washington
Editors:
Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Note:
Pages:
141–153
Language:
URL:
https://aclanthology.org/2022.bea-1.19
DOI:
10.18653/v1/2022.bea-1.19
Bibkey:
Cite (ACL):
Zarah Weiss and Detmar Meurers. 2022. Assessing sentence readability for German language learners with broad linguistic modeling or readability formulas: When do linguistic insights make a difference?. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022), pages 141–153, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Assessing sentence readability for German language learners with broad linguistic modeling or readability formulas: When do linguistic insights make a difference? (Weiss & Meurers, BEA 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.bea-1.19.pdf
Data
TextComplexityDE