Xianzhi Yu
2022
HW-TSC’s Submission for the WMT22 Efficiency Task
Hengchao Shang | Ting Hu | Daimeng Wei | Zongyao Li | Xianzhi Yu | Jianfei Feng | Ting Zhu | Lizhi Lei | Shimin Tao | Hao Yang | Ying Qin | Jinlong Yang | Zhiqiang Rao | Zhengzhe Yu
Proceedings of the Seventh Conference on Machine Translation (WMT)
This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2022 Efficiency Shared Task. For this year’s task, we again apply a sentence-level distillation strategy to train small models with different configurations. We then integrate the average attention mechanism into a lightweight RNN model to pursue more efficient decoding. We also try adding a retraining step to our 8-bit and 4-bit models to balance model size and quality. As before, we use Huawei Noah’s Bolt for INT8 inference and 4-bit storage. Taking advantage of Bolt’s support for batch inference and multi-core parallel computing, we submit models with different configurations to the CPU latency and throughput tracks to explore the Pareto frontiers.
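For readers unfamiliar with average attention, the following is a minimal sketch of its core idea only: each decoder position sees a uniform cumulative average of the inputs up to that position, which can be updated in constant time per decoding step. It is not the submitted model. The names `cumulative_average` and `AverageState` are hypothetical, and the gating and feed-forward layers that usually accompany the average, as well as how the abstract's RNN model uses it, are left out.

```python
from typing import List

Vector = List[float]


def cumulative_average(inputs: List[Vector]) -> List[Vector]:
    """For each position j (1-based), return the mean of inputs[0..j-1]."""
    outputs: List[Vector] = []
    running_sum: Vector = [0.0] * len(inputs[0]) if inputs else []
    for j, vec in enumerate(inputs, start=1):
        running_sum = [s + x for s, x in zip(running_sum, vec)]
        outputs.append([s / j for s in running_sum])
    return outputs


class AverageState:
    """Incremental form (hypothetical helper): keeps a running sum so each
    decoding step costs O(dim) regardless of how many tokens came before."""

    def __init__(self, dim: int) -> None:
        self.running_sum: Vector = [0.0] * dim
        self.steps = 0

    def update(self, vec: Vector) -> Vector:
        """Add one new input vector and return the updated average."""
        self.steps += 1
        self.running_sum = [s + x for s, x in zip(self.running_sum, vec)]
        return [s / self.steps for s in self.running_sum]


if __name__ == "__main__":
    toy_inputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(cumulative_average(toy_inputs))
    state = AverageState(dim=2)
    print([state.update(v) for v in toy_inputs])
```

The batch function suits training, where all target positions are known, while the stateful form matches step-by-step decoding, where only the newest token arrives at each step.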