Title: Jamo Pair Encoding: Subcharacter Representation-based Extreme Korean Vocabulary Compression for Efficient Subword Tokenization
Authors: Sangwhan Moon; Naoaki Okazaki
Date: 2020-05
Type: text (conference publication)
Language: eng
Published in: Proceedings of the Twelfth Language Resources and Evaluation Conference
Editors: Nicoletta Calzolari; Frédéric Béchet; Philippe Blache; Khalid Choukri; Christopher Cieri; Thierry Declerck; Sara Goggi; Hitoshi Isahara; Bente Maegaard; Joseph Mariani; Hélène Mazo; Asuncion Moreno; Jan Odijk; Stelios Piperidis
Publisher: European Language Resources Association
Place: Marseille, France
ISBN: 979-10-95546-34-4
Identifier: moon-okazaki-2020-jamo
URL: https://aclanthology.org/2020.lrec-1.429/
Pages: 3490-3497