Title: The Effect of Model Capacity and Script Diversity on Subword Tokenization for Sorani Kurdish
Authors: Ali Salehi; Cassandra L Jacobs
Date: 2024-06
Type: conference publication
Published in: Proceedings of the 21st SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
Editors: Garrett Nicolai; Eleanor Chodroff; Frederic Mailhot; Çağrı Çöltekin
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Pages: 51-56
Citation key: salehi-jacobs-2024-effect
DOI: 10.18653/v1/2024.sigmorphon-1.6
URL: https://aclanthology.org/2024.sigmorphon-1.6/