@inproceedings{howell-etal-2022-domain,
title = "Domain-specific knowledge distillation yields smaller and better models for conversational commerce",
author = "Howell, Kristen and
Wang, Jian and
Hazare, Akshay and
Bradley, Joseph and
Brew, Chris and
Chen, Xi and
Dunn, Matthew and
Hockey, Beth and
Maurer, Andrew and
Widdows, Dominic",
editor = "Malmasi, Shervin and
Rokhlenko, Oleg and
Ueffing, Nicola and
Guy, Ido and
Agichtein, Eugene and
Kallumadi, Surya",
booktitle = "Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.ecnlp-1.18",
doi = "10.18653/v1/2022.ecnlp-1.18",
pages = "151--160",
abstract = "We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average 2.3{\%} improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.",
}
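The two-stage recipe the abstract describes, distillation on in-domain data followed by task training with the encoder frozen, can be illustrated with a minimal PyTorch sketch. This is a toy under stated assumptions: tiny linear modules and random tensors stand in for mBERT and real conversational-commerce text, the layer sizes, vocabulary, and temperature are illustrative placeholders, and only the soft-target loss structure follows Sanh et al. (2019); it is not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the teacher (mBERT-sized) and the smaller student encoder.
# Random features replace real in-domain conversational-commerce text.
teacher_enc = nn.Sequential(nn.Linear(128, 768), nn.Tanh())
student_enc = nn.Sequential(nn.Linear(128, 384), nn.Tanh())
teacher_head = nn.Linear(768, 1000)   # toy MLM head over a 1000-word vocabulary
student_head = nn.Linear(384, 1000)

opt = torch.optim.Adam(
    list(student_enc.parameters()) + list(student_head.parameters()), lr=1e-4
)
T = 2.0  # softmax temperature for soft targets, as in Sanh et al. (2019)

# Stage 1: distill on (stand-in) in-domain data against the frozen teacher.
for step in range(100):
    x = torch.randn(32, 128)                     # a batch of in-domain features
    with torch.no_grad():
        t_logits = teacher_head(teacher_enc(x))  # teacher soft targets
    s_logits = student_head(student_enc(x))
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: freeze the distilled encoder and train only a task classifier on
# top, so one encoder can serve several tasks and languages.
for p in student_enc.parameters():
    p.requires_grad = False
classifier = nn.Linear(384, 5)                   # e.g. a 5-way intent classifier
clf_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
x, y = torch.randn(32, 128), torch.randint(0, 5, (32,))
clf_loss = F.cross_entropy(classifier(student_enc(x)), y)
clf_opt.zero_grad()
clf_loss.backward()
clf_opt.step()
```

Freezing the encoder in stage 2 is what lets a single distilled encoder back multiple lightweight classifier heads, the multi-task, multi-language sharing the abstract points to.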
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="howell-etal-2022-domain">
<titleInfo>
<title>Domain-specific knowledge distillation yields smaller and better models for conversational commerce</title>
</titleInfo>
<name type="personal">
<namePart type="given">Kristen</namePart>
<namePart type="family">Howell</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jian</namePart>
<namePart type="family">Wang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Akshay</namePart>
<namePart type="family">Hazare</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joseph</namePart>
<namePart type="family">Bradley</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Chris</namePart>
<namePart type="family">Brew</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Xi</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Matthew</namePart>
<namePart type="family">Dunn</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Beth</namePart>
<namePart type="family">Hockey</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Andrew</namePart>
<namePart type="family">Maurer</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Dominic</namePart>
<namePart type="family">Widdows</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2022-05</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Shervin</namePart>
<namePart type="family">Malmasi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Oleg</namePart>
<namePart type="family">Rokhlenko</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nicola</namePart>
<namePart type="family">Ueffing</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ido</namePart>
<namePart type="family">Guy</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Eugene</namePart>
<namePart type="family">Agichtein</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Surya</namePart>
<namePart type="family">Kallumadi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Dublin, Ireland</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average 2.3% improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.</abstract>
<identifier type="citekey">howell-etal-2022-domain</identifier>
<identifier type="doi">10.18653/v1/2022.ecnlp-1.18</identifier>
<location>
<url>https://aclanthology.org/2022.ecnlp-1.18</url>
</location>
<part>
<date>2022-05</date>
<extent unit="page">
<start>151</start>
<end>160</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Domain-specific knowledge distillation yields smaller and better models for conversational commerce
%A Howell, Kristen
%A Wang, Jian
%A Hazare, Akshay
%A Bradley, Joseph
%A Brew, Chris
%A Chen, Xi
%A Dunn, Matthew
%A Hockey, Beth
%A Maurer, Andrew
%A Widdows, Dominic
%Y Malmasi, Shervin
%Y Rokhlenko, Oleg
%Y Ueffing, Nicola
%Y Guy, Ido
%Y Agichtein, Eugene
%Y Kallumadi, Surya
%S Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)
%D 2022
%8 May
%I Association for Computational Linguistics
%C Dublin, Ireland
%F howell-etal-2022-domain
%X We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of Sanh et al. (2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average 2.3% improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.
%R 10.18653/v1/2022.ecnlp-1.18
%U https://aclanthology.org/2022.ecnlp-1.18
%U https://doi.org/10.18653/v1/2022.ecnlp-1.18
%P 151-160
Markdown (Informal)
[Domain-specific knowledge distillation yields smaller and better models for conversational commerce](https://aclanthology.org/2022.ecnlp-1.18) (Howell et al., ECNLP 2022)
ACL
Kristen Howell, Jian Wang, Akshay Hazare, Joseph Bradley, Chris Brew, Xi Chen, Matthew Dunn, Beth Hockey, Andrew Maurer, and Dominic Widdows. 2022. Domain-specific knowledge distillation yields smaller and better models for conversational commerce. In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5), pages 151–160, Dublin, Ireland. Association for Computational Linguistics.