Hybrid Models for Sentence Readability Assessment

Fengkai Liu, John Lee


Abstract
Automatic readability assessment (ARA) predicts how difficult it is for a reader to understand a text. While ARA has traditionally been performed at the passage level, there has been increasing interest in ARA at the sentence level, given its applications in downstream tasks such as text simplification and language exercise generation. Recent research has suggested the effectiveness of hybrid approaches for ARA, but they have yet to be applied at the sentence level. We present the first study that compares neural and hybrid models for sentence-level ARA. We conducted experiments on graded sentences from the Wall Street Journal (WSJ) and a dataset derived from the OneStopEnglish corpus. Experimental results show that both neural and hybrid models outperform traditional classifiers trained on linguistic features. Hybrid models obtained the best accuracy on both datasets, surpassing the previous best result reported on the WSJ dataset by almost 13% absolute.
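The abstract does not spell out the hybrid architecture, but a common way to build such a model is to concatenate a transformer sentence embedding with handcrafted linguistic features before a classification layer. The sketch below illustrates that general pattern only; the encoder choice (bert-base-uncased), the two toy features (token count and mean word length), and the linear head are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch of a hybrid sentence-readability classifier: a transformer
# sentence encoding concatenated with handcrafted linguistic features.
# All concrete choices below (encoder, features, head) are illustrative
# assumptions, not the paper's exact setup.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class HybridReadabilityClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", n_features=2, n_classes=3):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(encoder_name)
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # Classification head over the [CLS] embedding plus linguistic features.
        self.head = nn.Linear(hidden + n_features, n_classes)

    def linguistic_features(self, sentences):
        # Two toy surface features: token count and mean word length.
        feats = [
            [len(s.split()), sum(len(w) for w in s.split()) / max(len(s.split()), 1)]
            for s in sentences
        ]
        return torch.tensor(feats, dtype=torch.float)

    def forward(self, sentences):
        batch = self.tokenizer(sentences, padding=True, truncation=True,
                               return_tensors="pt")
        cls = self.encoder(**batch).last_hidden_state[:, 0]  # [CLS] embedding
        x = torch.cat([cls, self.linguistic_features(sentences)], dim=-1)
        return self.head(x)  # logits over readability levels


model = HybridReadabilityClassifier()
logits = model([
    "The cat sat on the mat.",
    "Notwithstanding prior objections, the committee ratified the amendment.",
])
print(logits.argmax(dim=-1))  # predicted readability class per sentence
```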
Anthology ID: 2023.bea-1.37
Volume: Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, Zheng Yuan, Torsten Zesch
Venue: BEA
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 448–454
URL: https://aclanthology.org/2023.bea-1.37
DOI: 10.18653/v1/2023.bea-1.37
Cite (ACL): Fengkai Liu and John Lee. 2023. Hybrid Models for Sentence Readability Assessment. In Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023), pages 448–454, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Hybrid Models for Sentence Readability Assessment (Liu & Lee, BEA 2023)
PDF: https://aclanthology.org/2023.bea-1.37.pdf