Everest Liu


2019

Comparing and Developing Tools to Measure the Readability of Domain-Specific Texts
Elissa Redmiles | Lisa Maszkiewicz | Emily Hwang | Dhruv Kuchhal | Everest Liu | Miraida Morales | Denis Peskov | Sudha Rao | Rock Stevens | Kristina Gligorić | Sean Kross | Michelle Mazurek | Hal Daumé III
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The readability of a digital text can influence people’s ability to learn new things about a range of topics from digital resources (e.g., Wikipedia, WebMD). Readability also impacts search rankings and is used to evaluate the performance of NLP systems. Despite this, we lack a thorough understanding of how to validly measure readability at scale, especially for domain-specific texts. In this work, we present a comparison of the validity of well-known readability measures and introduce a novel approach, Smart Cloze, which is designed to address shortcomings of existing measures. We compare these approaches across four different corpora: crowdworker-generated stories, Wikipedia articles, security and privacy advice, and health information. On these corpora, we evaluate the convergent and content validity of each measure, and detail tradeoffs in score precision, domain-specificity, and participant burden. These results provide a foundation for more accurate readability measurements and better evaluation of new natural-language-processing systems and tools.