Gatekeeper to save COGS and improve efficiency of Text Prediction

Nidhi Tiwari, Sneha Kola, Milos Milunovic, Si-qing Chen, Marjan Slavkovski


Abstract
The text prediction (TP) workflow calls a Large Language Model (LLM) after almost every character to generate the subsequent sequence of characters, until the user accepts a suggestion. The confidence score of the prediction is commonly used to filter results so that only correct predictions are shown to the user. Because LLMs require massive amounts of computation and storage, this approach incurs high network and execution costs. We therefore propose a model gatekeeper (GK) that blocks, at the client-application level, LLM calls that would result in incorrect predictions. In this way, a GK saves model-inference cost and improves the user experience by not showing incorrect predictions. We demonstrate that the use of a model gatekeeper saved approximately 46.6% of COGS for TP, at the cost of an approximately 4.5% loss in character savings. Use of the GK also improved the efficiency (suggestion rate) of the TP model by 73%.
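The abstract's idea can be sketched as follows: a cheap client-side scorer decides, before any network call, whether invoking the expensive LLM is likely to produce an accepted suggestion. This is a minimal illustrative sketch; the function names, features, and threshold are assumptions for demonstration, not the paper's actual model.

```python
# Hypothetical client-side gatekeeper sketch (illustrative, not the paper's model):
# a lightweight scorer estimates whether an LLM call would yield an accepted
# suggestion; the expensive LLM is invoked only when the score clears a threshold.

def gatekeeper_score(prefix: str) -> float:
    """Toy stand-in for a lightweight gatekeeper model: estimates the
    chance that an LLM suggestion for this typing context is accepted."""
    if not prefix or prefix.endswith((".", "?", "!")):
        return 0.1  # sentence boundary: suggestions rarely accepted here
    words = prefix.split()
    last_word = words[-1] if words else ""
    # Assumed heuristic: a longer partial word gives more signal.
    return min(0.9, 0.3 + 0.1 * len(last_word))

def predict_with_gatekeeper(prefix: str, llm, threshold: float = 0.5):
    """Call the expensive LLM only if the gatekeeper clears the threshold;
    returning None means the call was suppressed (saving one inference)."""
    if gatekeeper_score(prefix) < threshold:
        return None
    return llm(prefix)

# Toy LLM stand-in that records how many calls were actually made.
calls = []
def toy_llm(prefix):
    calls.append(prefix)
    return prefix + "..."

predict_with_gatekeeper("Thanks for your", toy_llm)  # score clears threshold
predict_with_gatekeeper("Done.", toy_llm)            # suppressed, no LLM call
print(f"LLM calls made: {len(calls)}")               # → LLM calls made: 1
```

Suppressed calls are exactly where the COGS saving comes from: the LLM is never invoked for contexts the gatekeeper judges unlikely to yield an accepted suggestion.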
Anthology ID:
2023.emnlp-industry.5
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
December
Year:
2023
Address:
Singapore
Editors:
Mingxuan Wang, Imed Zitouni
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
46–53
URL:
https://aclanthology.org/2023.emnlp-industry.5
DOI:
10.18653/v1/2023.emnlp-industry.5
Cite (ACL):
Nidhi Tiwari, Sneha Kola, Milos Milunovic, Si-qing Chen, and Marjan Slavkovski. 2023. Gatekeeper to save COGS and improve efficiency of Text Prediction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 46–53, Singapore. Association for Computational Linguistics.
Cite (Informal):
Gatekeeper to save COGS and improve efficiency of Text Prediction (Tiwari et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-industry.5.pdf
Video:
https://aclanthology.org/2023.emnlp-industry.5.mp4