Query Distillation: BERT-based Distillation for Ensemble Ranking

Wangshu Zhang, Junhong Liu, Zujie Wen, Yafang Wang, Gerard de Melo


Abstract
Recent years have witnessed substantial progress in the development of neural ranking networks, but also an increasingly heavy computational burden due to growing numbers of parameters and the adoption of model ensembles. Knowledge Distillation (KD) is a common solution for balancing effectiveness and efficiency, but it is not straightforward to apply KD to ranking problems. Ranking Distillation (RD) has been proposed to address this issue, but has only been shown to be effective on recommendation tasks. We present a novel two-stage distillation method for ranking problems that allows a smaller student model to be trained while benefitting from the better performance of the teacher model, providing better control of the inference latency and computational burden. We design a novel BERT-based model structure for list-wise ranking to serve as our student model. All ranking candidates are fed to the BERT model simultaneously, such that the self-attention mechanism can enable joint inference to rank the document list. Our experiments confirm the advantages of our method, not just with regard to inference latency but also in terms of higher-quality rankings compared to the original teacher model.
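
The architectural idea sketched in the abstract, feeding the query together with all candidate documents into a single BERT input so that self-attention can score the candidates jointly, can be illustrated with a short code sketch. The following minimal PyTorch/Hugging Face example is an illustrative assumption, not the authors' implementation: the input packing scheme, the mean-pooling over each candidate's tokens, and the linear scoring head are hypothetical choices made for clarity.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class ListwiseBertRanker(nn.Module):
    """Scores all candidates of one query jointly from a single packed input sequence."""

    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.scorer = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, candidate_spans):
        # candidate_spans: list of (start, end) token index ranges, one per candidate.
        hidden = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state[0]
        # Mean-pool the contextualised tokens of each candidate segment, then score it.
        pooled = torch.stack([hidden[s:e].mean(dim=0) for s, e in candidate_spans])
        return self.scorer(pooled).squeeze(-1)  # one relevance score per candidate

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = ListwiseBertRanker().eval()

query = "what is knowledge distillation"
candidates = [
    "Knowledge distillation transfers knowledge from a large teacher to a small student model.",
    "Distilled water is produced by boiling water and condensing the steam.",
]

# Pack [CLS] query [SEP] doc_1 [SEP] doc_2 [SEP] ... into one sequence and remember
# the token span of each document so that it can be pooled separately afterwards.
ids = [tokenizer.cls_token_id] + tokenizer.encode(query, add_special_tokens=False) + [tokenizer.sep_token_id]
spans = []
for doc in candidates:
    doc_ids = tokenizer.encode(doc, add_special_tokens=False)
    spans.append((len(ids), len(ids) + len(doc_ids)))
    ids += doc_ids + [tokenizer.sep_token_id]

with torch.no_grad():
    scores = model(torch.tensor([ids]), torch.ones(1, len(ids), dtype=torch.long), spans)
print(scores.argsort(descending=True))  # candidate indices, best first

Because all candidates share one forward pass, every candidate's representation is conditioned on the others through self-attention, which is what makes the ranking decision list-wise rather than point-wise.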
Anthology ID: 2020.coling-industry.4
Volume: Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Month: December
Year: 2020
Address: Online
Editors: Ann Clifton, Courtney Napoles
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 33–43
URL: https://aclanthology.org/2020.coling-industry.4
DOI: 10.18653/v1/2020.coling-industry.4
Cite (ACL): Wangshu Zhang, Junhong Liu, Zujie Wen, Yafang Wang, and Gerard de Melo. 2020. Query Distillation: BERT-based Distillation for Ensemble Ranking. In Proceedings of the 28th International Conference on Computational Linguistics: Industry Track, pages 33–43, Online. International Committee on Computational Linguistics.
Cite (Informal): Query Distillation: BERT-based Distillation for Ensemble Ranking (Zhang et al., COLING 2020)
PDF: https://aclanthology.org/2020.coling-industry.4.pdf
Data: MS MARCO