Exploring Conditional Variational Mechanism to Pinyin Input Method for Addressing One-to-Many Mappings in Low-Resource Scenarios

Bin Sun, Jianfeng Li, Hao Zhou, Fandong Meng, Kan Li, Jie Zhou


Abstract
A pinyin input method engine (IME) is a tool that converts pinyin sequences into Chinese characters and is widely used in mobile phone applications. Because of homophones, a pinyin IME faces a one-to-many mapping problem: a single pinyin sequence can correspond to many Chinese character sequences. To address this issue, this paper makes the first exploration of an effective conditional variational mechanism (CVM) for pinyin IME. However, for a pinyin IME to run stably and smoothly under low-resource conditions (e.g., on offline mobile devices), the CVM must balance diversity, accuracy, and efficiency, which remains challenging. To this end, we employ a novel strategy that reduces the complexity of semantic encoding by facilitating interaction between pinyin and Chinese character information during the construction of continuous latent variables. At the same time, discrete latent variables are used to improve the accuracy of the results. Experimental results demonstrate the superior performance of our method.
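
The one-to-many mapping mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method and does not implement the CVM. It only counts how many character sequences are consistent with a toneless pinyin sequence. The lexicon is a tiny hand-picked toy, and the names `TOY_LEXICON`, `build_candidates`, and `ambiguity` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon of (toneless pinyin, character) pairs.
# Hand-picked for illustration only; not taken from the paper.
TOY_LEXICON = [
    ("shi", "是"), ("shi", "时"), ("shi", "事"), ("shi", "十"),
    ("ma", "马"), ("ma", "吗"), ("ma", "妈"),
    ("wo", "我"),
]


def build_candidates(lexicon):
    """Group characters by their pinyin syllable, keeping insertion order."""
    table = defaultdict(list)
    for pinyin, char in lexicon:
        table[pinyin].append(char)
    return dict(table)


def ambiguity(pinyin_seq, table):
    """Count character sequences consistent with a pinyin sequence.

    Each syllable is treated independently, so the count is the product
    of the per-syllable candidate counts (0 if a syllable is unknown).
    """
    count = 1
    for syllable in pinyin_seq:
        count *= len(table.get(syllable, []))
    return count


candidates = build_candidates(TOY_LEXICON)
print(candidates["shi"])                      # homophones sharing "shi"
print(ambiguity(["wo", "shi"], candidates))   # candidate sequences for "wo shi"
```

Even in this toy setting the number of candidates grows multiplicatively with sequence length, which is why an IME needs some way to rank or sample among them.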
Anthology ID:
2024.acl-short.56
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
616–629
URL:
https://aclanthology.org/2024.acl-short.56
Cite (ACL):
Bin Sun, Jianfeng Li, Hao Zhou, Fandong Meng, Kan Li, and Jie Zhou. 2024. Exploring Conditional Variational Mechanism to Pinyin Input Method for Addressing One-to-Many Mappings in Low-Resource Scenarios. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 616–629, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Exploring Conditional Variational Mechanism to Pinyin Input Method for Addressing One-to-Many Mappings in Low-Resource Scenarios (Sun et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-short.56.pdf