GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation

Wenjie Zhou; Zhenxin Ding; Xiaodong Zhang; Haibo Shi; Junfeng Wang; Dawei Yin

doi:10.18653/v1/2024.emnlp-industry.120

Abstract

Pre-trained language models have become an integral component of question-answering systems, achieving remarkable performance. However, for practical deployment, it is crucial to perform knowledge distillation to maintain high performance while operating under computational constraints. In this paper, we address a key question: given the importance of unsupervised distillation for student model performance, how can knowledge from multiple teacher models be effectively ensembled during this stage without the guidance of labels? We propose a novel algorithm, GOVERN, to tackle this issue. GOVERN has demonstrated significant improvements in both offline and online experiments, enabling the student model to achieve results comparable to those of teacher ensembles. Our experiments show that GOVERN requires a mere 1% of the ensemble method’s inference budget to achieve 99.5% of its performance. The proposed algorithm has been successfully deployed in a real-world commercial question-answering system, demonstrating its practical applicability.

Anthology ID: 2024.emnlp-industry.120
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month: November
Year: 2024
Address: Miami, Florida, US
Editors: Franck Dernoncourt, Daniel Preoţiuc-Pietro, Anastasia Shimorina
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 1650–1658
URL: https://aclanthology.org/2024.emnlp-industry.120/
DOI: 10.18653/v1/2024.emnlp-industry.120
Cite (ACL): Wenjie Zhou, Zhenxin Ding, Xiaodong Zhang, Haibo Shi, Junfeng Wang, and Dawei Yin. 2024. GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1650–1658, Miami, Florida, US. Association for Computational Linguistics.
Cite (Informal): GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation (Zhou et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-industry.120.pdf
