Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation

Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, Xin Jiang


Abstract
The Transformer has proven effective in Neural Machine Translation (NMT). However, it is memory- and time-consuming on edge devices, making real-time feedback difficult. To compress and accelerate the Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank while reducing operations and parameters. A Transformer built with HTT, named Hypoformer, consistently and notably outperforms recent lightweight SOTA methods on three standard translation tasks under different parameter and speed scales. In extreme low-resource scenarios, Hypoformer achieves a 7.1-point absolute BLEU improvement and a 1.27× speedup over the vanilla Transformer on the IWSLT'14 De-En task.
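To make the parameter savings concrete, below is a minimal, illustrative NumPy sketch of a plain two-core Tensor-Train (TT-matrix) factorization of a dense weight, the building block that HTT combines with low-rank decomposition in the paper. The mode sizes, rank, and function name are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def tt_to_dense(core1, core2):
    """Reconstruct a dense (m1*m2) x (n1*n2) weight from two TT cores.

    core1 has shape (1, m1, n1, r); core2 has shape (r, m2, n2, 1),
    where r is the TT rank shared between the cores.
    """
    c1 = core1[0]          # (m1, n1, r)
    c2 = core2[..., 0]     # (r, m2, n2)
    # Contract over the shared TT rank, then merge row and column modes.
    w = np.einsum('ajr,rbk->abjk', c1, c2)   # (m1, m2, n1, n2)
    m1, m2, n1, n2 = w.shape
    return w.reshape(m1 * m2, n1 * n2)

# Hypothetical example: a 512 x 2048 feed-forward weight factorized
# as (16*32) x (32*64) with TT rank 8.
m1, m2, n1, n2, r = 16, 32, 32, 64, 8
core1 = 0.02 * np.random.randn(1, m1, n1, r)
core2 = 0.02 * np.random.randn(r, m2, n2, 1)
W = tt_to_dense(core1, core2)            # dense equivalent: (512, 2048)

dense_params = (m1 * m2) * (n1 * n2)     # 1,048,576 parameters
tt_params = core1.size + core2.size      # 4,096 + 16,384 = 20,480 parameters
print(W.shape, dense_params, tt_params)
```

A pure TT factorization like this caps the effective rank of the reconstructed matrix at the TT rank; the paper's HTT decomposition is motivated by recovering full rank while keeping this kind of reduction in parameters and operations.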
Anthology ID:
2022.emnlp-main.475
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7056–7068
URL:
https://aclanthology.org/2022.emnlp-main.475
DOI:
10.18653/v1/2022.emnlp-main.475
Cite (ACL):
Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, and Xin Jiang. 2022. Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7056–7068, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation (Li et al., EMNLP 2022)
PDF:
https://aclanthology.org/2022.emnlp-main.475.pdf