Multi-teacher Distillation for Multilingual Spelling Correction

Jingfen Zhang; Xuan Guo; Sravan Bodapati; Christopher Potts

doi:10.18653/v1/2023.emnlp-industry.15

Abstract

Accurate spelling correction is a critical step in modern search interfaces, especially in an era of mobile devices and speech-to-text interfaces. For services that are deployed around the world, this poses a significant challenge for multilingual NLP: spelling errors need to be caught and corrected in all languages, and even in queries that use multiple languages. In this paper, we tackle this challenge using multi-teacher distillation. In our approach, a monolingual teacher model is trained for each language/locale, and these individual models are distilled into a single multilingual student model intended to serve all languages/locales. In experiments using open-source data as well as customer data from a worldwide search service, we show that this leads to highly effective spelling correction models that can meet the tight latency requirements of deployed services.
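
To make the idea of distilling several monolingual teachers into one student more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the loss, the model architecture, or the training setup, so everything below is an assumption for illustration. In particular, the KL-divergence objective with a temperature, the function names (`softmax`, `kl_divergence`, `multi_teacher_distillation_loss`), the locale codes, and the toy "models" that return fixed scores over three candidate corrections are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def multi_teacher_distillation_loss(batch, teachers, student, temperature=2.0):
    """Average KL(teacher || student) over a batch.

    Each example is a (locale, query) pair and is routed to the teacher
    for its own locale; the single student scores every example.
    Assumption: teachers and student return logits over the same candidates.
    """
    total = 0.0
    for locale, query in batch:
        teacher_probs = softmax(teachers[locale](query), temperature)
        student_probs = softmax(student(query), temperature)
        total += kl_divergence(teacher_probs, student_probs)
    return total / len(batch)

# Hypothetical toy models: each returns fixed logits over three candidates.
teachers = {
    "en_US": lambda query: [2.0, 0.5, -1.0],
    "de_DE": lambda query: [-1.0, 0.5, 2.0],
}
untrained_student = lambda query: [0.0, 0.0, 0.0]

batch = [("en_US", "recieve"), ("de_DE", "strasse")]
loss = multi_teacher_distillation_loss(batch, teachers, untrained_student)

# A student that matches the teacher exactly incurs zero loss on that locale.
matched_loss = multi_teacher_distillation_loss(
    [("en_US", "recieve")], teachers, teachers["en_US"]
)
```

In this sketch the locale label decides which teacher supervises each example, while the student sees mixed-locale batches. A real system would minimize such a loss with gradient descent over sequence models; how the paper handles queries that mix several languages is not stated in the abstract and is not modeled here.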

Anthology ID: 2023.emnlp-industry.15
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: December
Year: 2023
Address: Singapore
Editors: Mingxuan Wang, Imed Zitouni
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 142–151
URL: https://aclanthology.org/2023.emnlp-industry.15/
DOI: 10.18653/v1/2023.emnlp-industry.15
Cite (ACL): Jingfen Zhang, Xuan Guo, Sravan Bodapati, and Christopher Potts. 2023. Multi-teacher Distillation for Multilingual Spelling Correction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 142–151, Singapore. Association for Computational Linguistics.
Cite (Informal): Multi-teacher Distillation for Multilingual Spelling Correction (Zhang et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-industry.15.pdf
Video: https://aclanthology.org/2023.emnlp-industry.15.mp4
