Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression

Jinshan Zeng, Yudong Xie, Xianglong Yu, John Lee, Ding-Xuan Zhou


Abstract
The readability assessment task aims to assign a difficulty grade to a text. While neural models have recently demonstrated impressive performance, most do not exploit the ordinal nature of the difficulty grades, and devote little effort to model initialization to facilitate fine-tuning. We address these limitations with soft labels for ordinal regression, and with model pre-training through prediction of pairwise relative text difficulty. We incorporate these two components into a model based on hierarchical attention networks, and evaluate its performance on both English and Chinese datasets. Experimental results show that our proposed model outperforms competitive neural models and statistical classifiers on most datasets.
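The two components named in the abstract can be made concrete with a short sketch. The Python snippet below is illustrative only: it assumes a Gaussian-style softening over grade distances for the soft labels and a logistic pairwise ranking loss for the relative-difficulty pre-training objective. The function names, the temperature parameter tau, and the exact formulations are assumptions for exposition, not the authors' recipe (see the paper PDF for the actual method).

import torch
import torch.nn.functional as F

def soft_labels(gold: torch.Tensor, num_grades: int, tau: float = 1.0) -> torch.Tensor:
    """Map integer grades (batch,) to soft target distributions (batch, num_grades).

    Instead of a one-hot target on the gold grade, probability mass decays
    with squared distance from it, so a prediction one grade off is penalized
    less than one several grades off. Gaussian-style softening is an assumption.
    """
    grades = torch.arange(num_grades, dtype=torch.float32)            # (K,)
    dist = (grades.unsqueeze(0) - gold.unsqueeze(1).float()) ** 2     # (B, K) squared grade distance
    return F.softmax(-dist / tau, dim=1)                              # closer grades get more mass

def soft_label_loss(logits: torch.Tensor, gold: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Cross-entropy between the soft targets and the model's predicted distribution."""
    targets = soft_labels(gold, logits.size(1), tau)
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def pairwise_difficulty_loss(score_a: torch.Tensor, score_b: torch.Tensor,
                             a_is_harder: torch.Tensor) -> torch.Tensor:
    """Pre-training signal: predict which of two texts is harder.

    score_a / score_b are scalar difficulty scores (B,) from a shared encoder;
    a_is_harder holds 0/1 float labels. A logistic ranking loss on the score
    difference is one plausible instantiation of 'pairwise relative difficulty'.
    """
    return F.binary_cross_entropy_with_logits(score_a - score_b, a_is_harder)

if __name__ == "__main__":
    gold = torch.tensor([0, 2])                 # gold grades for two texts
    print(soft_labels(gold, num_grades=4))      # rows peak at grades 0 and 2, decaying with distance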
Anthology ID: 2022.findings-emnlp.334
Volume: Findings of the Association for Computational Linguistics: EMNLP 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4557–4568
URL: https://aclanthology.org/2022.findings-emnlp.334
DOI: 10.18653/v1/2022.findings-emnlp.334
Cite (ACL): Jinshan Zeng, Yudong Xie, Xianglong Yu, John Lee, and Ding-Xuan Zhou. 2022. Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4557–4568, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression (Zeng et al., Findings 2022)
PDF: https://aclanthology.org/2022.findings-emnlp.334.pdf