Title: Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance
Authors: Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty
Date: 2024-08
Type: text (conference publication)
Published in: Findings of the Association for Computational Linguistics: ACL 2024
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics
Place: Bangkok, Thailand
Pages: 2274-2286
Citation key: goldman-etal-2024-unpacking
DOI: 10.18653/v1/2024.findings-acl.134
URL: https://aclanthology.org/2024.findings-acl.134/