Rik van Noord, Taja Kuzman, Peter Rupnik, Nikola Ljubešić, Miquel Esplà-Gomis, Gema Ramírez-Sánchez, and Antonio Toral. 2024. Do Language Models Care about Text Quality? Evaluating Web-Crawled Corpora across 11 Languages. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), edited by Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, pages 5221-5234, Torino, Italia, May 2024. ELRA and ICCL. Conference publication. Citation key: van-noord-etal-2024-language. https://aclanthology.org/2024.lrec-main.465/