Chenglei Si, Zhengyan Zhang, Yingfa Chen, Fanchao Qi, Xiaozhi Wang, Zhiyuan Liu, Yasheng Wang, Qun Liu, and Maosong Sun. 2023. Sub-Character Tokenization for Chinese Pretrained Language Models. Transactions of the Association for Computational Linguistics, 11:469-487. MIT Press, Cambridge, MA. Journal article. Citation key: si-etal-2023-sub. DOI: 10.1162/tacl_a_00560. URL: https://aclanthology.org/2023.tacl-1.28/