Neural Search Space in Gboard Decoder
Yanxiang Zhang | Yuanbo Zhang | Haicheng Sun | Yun Wang | Gary Sivek | Shumin Zhai
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Gboard Decoder produces suggestions by searching for paths that best match the input touch points in a context-aware search space, which is backed by a language Finite State Transducer (FST). The language FST is currently an N-gram language model (LM). However, N-gram LMs are limited in context length and are known to suffer from sparsity problems under on-device model size constraints. In this paper, we propose Neural Search Space, which replaces the N-gram LM with a Neural Network LM (NN-LM) and dynamically constructs the search space during decoding. Specifically, we integrate the long-range context awareness of the NN-LM into the search space by converting its outputs for a given context into the language FST at runtime. This involves redesigning the language FST structure, tuning pruning strategies, and optimizing data structures. Online experiments demonstrate improved quality, reducing the Words Modified Ratio by [0.26%, 1.19%] across various locales with acceptable latency increases. This work opens new avenues for further improving keyboard decoding quality by enhancing the neural LM more directly.
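The abstract describes converting NN-LM outputs for the current context into language-FST arcs at decode time. The sketch below illustrates one plausible shape of that step under stated assumptions: nn_lm_topk() is a hypothetical stand-in for the on-device neural LM, the "FST" is a plain list of (word, weight) arcs rather than a real OpenFst object, and the top-k and beam pruning parameters are illustrative only, not the paper's actual strategy.

```python
# Minimal sketch, assuming a hypothetical NN-LM query function and a
# simplified arc representation instead of a full FST library.
import math
from typing import Dict, List, Tuple


def nn_lm_topk(context: List[str], k: int) -> Dict[str, float]:
    """Hypothetical NN-LM call: {next_word: probability} for the top-k words."""
    raise NotImplementedError  # backed by the on-device neural LM in practice


def expand_state(context: List[str],
                 k: int = 64,
                 prune_threshold: float = 8.0) -> List[Tuple[str, float]]:
    """Dynamically build outgoing language arcs for one decoder state.

    Each arc carries a word label and a weight (negative log probability),
    pruned by top-k and by a beam relative to the best candidate so the
    runtime-constructed search space stays small.
    """
    probs = nn_lm_topk(context, k)
    weights = {w: -math.log(max(p, 1e-12)) for w, p in probs.items()}
    best = min(weights.values())
    # Keep only arcs within `prune_threshold` of the best arc.
    return sorted((w, c) for w, c in weights.items() if c - best <= prune_threshold)
```

In this reading, the decoder calls expand_state() lazily as hypotheses advance, so only the portion of the search space reachable from the observed context is ever materialized.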