AbstractAutomatic readability assessment (ARA) is the task of automatically assessing readability with little or no human supervision. ARA is essential for many second language acquisition applications to reduce the workload of annotators, who are usually language teachers. Previous unsupervised approaches manually searched textual features that correlated well with readability labels, such as perplexity scores of large language models. This paper argues that, to evaluate an assessors’ performance, rank-correlation coefficients should be used instead of Pearson’s correlation coefficient (𝜌). In the experiments, we show that its performance can be easily underestimated using Pearson’s 𝜌, which is significantly affected by the linearity of the output readability scores. We also propose a lightweight unsupervised readability assessor that achieved the best performance in both the rank correlations and Pearson’s 𝜌 among all unsupervised assessors compared.