Avijit Thawani, Saurabh Ghanekar, Xiaoyuan Zhu, and Jay Pujara. Learn Your Tokens: Word-Pooled Tokenization for Language Modeling. In Findings of the Association for Computational Linguistics: EMNLP 2023, edited by Houda Bouamor, Juan Pino, and Kalika Bali, pages 9883-9893, Singapore, December 2023. Association for Computational Linguistics. Conference publication. Anthology ID: thawani-etal-2023-learn. DOI: 10.18653/v1/2023.findings-emnlp.662. URL: https://aclanthology.org/2023.findings-emnlp.662/