Neural Search Space in Gboard Decoder

Yanxiang Zhang; Yuanbo Zhang; Haicheng Sun; Yun Wang; Gary Sivek; Shumin Zhai

doi:10.18653/v1/2024.emnlp-industry.93

Abstract

Gboard Decoder produces suggestions by searching for the paths that best match the input touch points in a context-aware search space, which is backed by a language Finite State Transducer (FST). The language FST is currently an N-gram language model (LM). However, N-gram LMs are limited in context length and are known to suffer from sparsity under on-device model size constraints. In this paper, we propose Neural Search Space, which replaces the N-gram LM with a Neural Network LM (NN-LM) and constructs the search space dynamically during decoding. Specifically, we integrate the long-range context awareness of the NN-LM into the search space by converting its outputs for a given context into the language FST at runtime. This involves redesigning the language FST structure, tuning pruning strategies, and optimizing data structures. Online experiments demonstrate improved quality, reducing Words Modified Ratio by [0.26%, 1.19%] across various locales with acceptable latency increases. This work opens new avenues for further improving keyboard decoding quality by enhancing the neural LM more directly.
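
To make the idea of converting NN-LM outputs into a language FST at runtime more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy stand-in for the NN-LM (`toy_lm`), a hypothetical `LazyLanguageFst` class, top-k pruning as the only pruning strategy, and negative log probability as the arc cost. None of these names or choices come from the paper.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; not the Gboard data structures.
Context = Tuple[str, ...]
Arc = Tuple[str, float, int]  # (word, cost = -log p, destination state id)


def toy_lm(context: Context) -> Dict[str, float]:
    """Stand-in for an NN-LM: next-word probabilities given the context."""
    if context and context[-1] == "good":
        return {"morning": 0.6, "night": 0.3, "grief": 0.1}
    return {"good": 0.5, "the": 0.4, "a": 0.1}


class LazyLanguageFst:
    """Creates states and arcs on demand from LM outputs (assumed design)."""

    def __init__(self, lm: Callable[[Context], Dict[str, float]], top_k: int = 2):
        self.lm = lm
        self.top_k = top_k
        self.state_of: Dict[Context, int] = {(): 0}  # context -> state id
        self.context_of: List[Context] = [()]        # state id -> context
        self.arcs: Dict[int, List[Arc]] = {}         # filled lazily

    def _state(self, context: Context) -> int:
        # Allocate a new state id the first time a context is seen.
        if context not in self.state_of:
            self.state_of[context] = len(self.context_of)
            self.context_of.append(context)
        return self.state_of[context]

    def expand(self, state: int) -> List[Arc]:
        """Query the LM once per state and keep only the top_k words as arcs."""
        if state not in self.arcs:
            context = self.context_of[state]
            probs = self.lm(context)
            best = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
            self.arcs[state] = [
                (word, -math.log(p), self._state(context + (word,)))
                for word, p in best[: self.top_k]
            ]
        return self.arcs[state]


fst = LazyLanguageFst(toy_lm, top_k=2)
start_arcs = fst.expand(0)               # arcs leaving the empty-context state
next_arcs = fst.expand(start_arcs[0][2])  # arcs after the most likely first word
```

In this sketch a state is expanded only when the decoder reaches it, so the LM is queried once per visited context and the pruned arcs are cached for reuse.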

Anthology ID: 2024.emnlp-industry.93
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2024
Address: Miami, Florida, US
Editors: Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1245–1254
URL: https://aclanthology.org/2024.emnlp-industry.93/
DOI: 10.18653/v1/2024.emnlp-industry.93
Cite (ACL): Yanxiang Zhang, Yuanbo Zhang, Haicheng Sun, Yun Wang, Gary Sivek, and Shumin Zhai. 2024. Neural Search Space in Gboard Decoder. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1245–1254, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal): Neural Search Space in Gboard Decoder (Zhang et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-industry.93.pdf
