Gaussian Multi-head Attention for Simultaneous Machine Translation

Shaolei Zhang, Yang Feng


Abstract
Simultaneous machine translation (SiMT) outputs translation while receiving the streaming source inputs, and hence needs a policy to determine when to start translating. The alignment between target and source words often identifies the most informative source word for each target word, and hence provides unified control over translation quality and latency, but unfortunately existing SiMT methods do not explicitly model alignment to perform this control. In this paper, we propose Gaussian Multi-head Attention (GMA) to develop a new SiMT policy by modeling alignment and translation in a unified manner. For the SiMT policy, GMA models the aligned source position of each target word, and accordingly waits until its aligned position before starting to translate. To integrate the learning of alignment into the translation model, a Gaussian distribution centered on the predicted aligned position is introduced as an alignment-related prior, which cooperates with translation-related soft attention to determine the final attention. Experiments on En-Vi and De-En tasks show that our method outperforms strong baselines on the trade-off between translation quality and latency.
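The core mechanism described in the abstract, combining a translation-related soft attention with an alignment-related Gaussian prior centered on a predicted source position, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, the single-head/NumPy setting, and the `sigma` hyperparameter are assumptions for clarity.

```python
import numpy as np

def gaussian_prior_attention(query, keys, aligned_pos, sigma=1.0):
    """Combine soft attention with a Gaussian alignment prior (sketch).

    query: (d,) representation of the current target word.
    keys: (n, d) representations of the source words received so far.
    aligned_pos: predicted aligned source position (may be fractional).
    sigma: std. dev. of the Gaussian prior (hypothetical hyperparameter).
    """
    d = query.shape[0]
    # Translation-related soft attention: scaled dot-product + softmax.
    scores = keys @ query / np.sqrt(d)
    soft = np.exp(scores - scores.max())
    soft /= soft.sum()
    # Alignment-related prior: Gaussian centered on the predicted position.
    positions = np.arange(keys.shape[0])
    prior = np.exp(-((positions - aligned_pos) ** 2) / (2 * sigma ** 2))
    # Final attention: the prior modulates the soft attention weights,
    # concentrating them around the predicted aligned position.
    attn = soft * prior
    return attn / attn.sum()
```

For the SiMT policy side, the predicted aligned position also acts as the read/write trigger: the model keeps reading source words until the predicted aligned position has been received, then writes the target word.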
Anthology ID:
2022.findings-acl.238
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3019–3030
URL:
https://aclanthology.org/2022.findings-acl.238
DOI:
10.18653/v1/2022.findings-acl.238
Cite (ACL):
Shaolei Zhang and Yang Feng. 2022. Gaussian Multi-head Attention for Simultaneous Machine Translation. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3019–3030, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Gaussian Multi-head Attention for Simultaneous Machine Translation (Zhang & Feng, Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.238.pdf
Code:
ictnlp/gma