Incorporating Word-level Phonemic Decoding into Readability Assessment

Christine Pinney, Casey Kennington, Maria Soledad Pera, Katherine Landau Wright, Jerry Alan Fails

Abstract
Current approaches to automatic readability assessment have found success with large language models and transformer architectures. These techniques improve accuracy, but they do not offer the interpretability required by the audience that most often uses readability assessment tools: teachers and educators. Recent work that employs more traditional machine learning methods with handcrafted feature sets has highlighted the linguistic importance of semantic and syntactic characteristics of text in readability assessment. Research in education suggests that, in addition to semantics and syntax, phonetic and orthographic instruction is necessary for children to progress through the stages of reading and spelling development; children must first learn to decode the letters and symbols on a page to recognize words and phonemes and their connection to speech sounds. Here, we incorporate this word-level phonemic decoding process into readability assessment by crafting a phonetically based feature set for grade-level classification of English text. Our resulting feature set shows performance comparable to that of much larger, semantically and syntactically based feature sets, supporting the linguistic value of orthographic and phonetic considerations in readability assessment.
Anthology ID: 2024.lrec-main.788
Volume: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month: May
Year: 2024
Address: Torino, Italia
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues: LREC | COLING
Publisher: ELRA and ICCL
Pages: 8998–9009
URL: https://aclanthology.org/2024.lrec-main.788
Cite (ACL): Christine Pinney, Casey Kennington, Maria Soledad Pera, Katherine Landau Wright, and Jerry Alan Fails. 2024. Incorporating Word-level Phonemic Decoding into Readability Assessment. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 8998–9009, Torino, Italia. ELRA and ICCL.
Cite (Informal): Incorporating Word-level Phonemic Decoding into Readability Assessment (Pinney et al., LREC-COLING 2024)
PDF: https://aclanthology.org/2024.lrec-main.788.pdf