Diformer: Directional Transformer for Neural Machine Translation

Minghan Wang, Jiaxin Guo, Yuxia Wang, Daimeng Wei, Hengchao Shang, Yinglu Li, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, Hao Yang


Abstract
Autoregressive (AR) and non-autoregressive (NAR) models have complementary strengths in performance and latency, so combining them in one model may offer the advantages of both. Current combination frameworks focus on integrating multiple decoding paradigms into a unified generative model, e.g. a masked language model. However, this generalization can hurt performance because of the gap between the training objective and inference. In this paper, we aim to close the gap by preserving the original objectives of AR and NAR under a unified framework. Specifically, we propose the Directional Transformer (Diformer), which jointly models AR and NAR as three generation directions (left-to-right, right-to-left and straight) through a newly introduced direction variable. This variable controls the prediction of each token so that it has the specific dependencies of its direction. Unification through direction preserves the original dependency assumptions of AR and NAR, retaining both generalization and performance. Experiments on 4 WMT benchmarks demonstrate that Diformer outperforms current unified-modelling works by more than 1.5 BLEU points for both AR and NAR decoding, and is also competitive with state-of-the-art independent AR and NAR models.
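The idea of a per-token direction that selects which target positions a prediction may depend on can be illustrated with a small sketch. This is not the paper's implementation: the function name `dependency_mask`, the direction labels, and the boolean-mask formulation are all hypothetical, and the sketch assumes the simplest reading of the abstract, in which a left-to-right token may see target positions to its left, a right-to-left token may see positions to its right, and a straight (NAR) token sees no other target positions.

```python
from typing import List

# Hypothetical direction labels; the abstract names the three directions
# but does not specify how they are encoded.
L2R, R2L, STRAIGHT = "l2r", "r2l", "straight"


def dependency_mask(directions: List[str]) -> List[List[bool]]:
    """Build a target-side dependency mask from per-token directions.

    mask[i][j] is True when the prediction at position i is allowed to
    depend on the target token at position j (an assumed formulation).
    """
    n = len(directions)
    mask = [[False] * n for _ in range(n)]
    for i, direction in enumerate(directions):
        if direction not in (L2R, R2L, STRAIGHT):
            raise ValueError(f"unknown direction: {direction!r}")
        for j in range(n):
            if direction == L2R:
                # AR, left-to-right: depend only on earlier positions.
                mask[i][j] = j < i
            elif direction == R2L:
                # AR, right-to-left: depend only on later positions.
                mask[i][j] = j > i
            # STRAIGHT (NAR): no target-side dependencies, row stays False.
    return mask


if __name__ == "__main__":
    for name in (L2R, R2L, STRAIGHT):
        print(name)
        for row in dependency_mask([name] * 4):
            print("  ", ["x" if allowed else "." for allowed in row])
```

Under these assumptions, a single model could be switched between AR and NAR behaviour by changing only the direction labels, which is the kind of unification the abstract describes.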
Anthology ID:
2022.eamt-1.11
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Editors:
Helena Moniz, Lieve Macken, Andrew Rufener, Loïc Barrault, Marta R. Costa-jussà, Christophe Declercq, Maarit Koponen, Ellie Kemp, Spyridon Pilos, Mikel L. Forcada, Carolina Scarton, Joachim Van den Bogaert, Joke Daems, Arda Tezcan, Bram Vanroy, Margot Fonteyne
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
81–90
URL:
https://aclanthology.org/2022.eamt-1.11
Cite (ACL):
Minghan Wang, Jiaxin Guo, Yuxia Wang, Daimeng Wei, Hengchao Shang, Yinglu Li, Chang Su, Yimeng Chen, Min Zhang, Shimin Tao, and Hao Yang. 2022. Diformer: Directional Transformer for Neural Machine Translation. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 81–90, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
Diformer: Directional Transformer for Neural Machine Translation (Wang et al., EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.11.pdf