Analysis of Positional Encodings for Neural Machine Translation

Jan Rosendahl, Viet Anh Khoa Tran, Weiyue Wang, Hermann Ney


Abstract
In this work we analyze and compare the behavior of the Transformer architecture when using different positional encoding methods. While absolute and relative positional encoding perform comparably well overall, we show that relative positional encoding is vastly superior (4.4% to 11.9% BLEU) when translating a sentence that is longer than any sentence observed during training. We further propose and analyze variations of relative positional encoding and observe that the number of trainable parameters can be reduced without a loss in performance, by using fixed encoding vectors or by removing some of the positional encoding vectors.
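For readers unfamiliar with the two encoding families compared in the abstract, the sketch below illustrates them in a minimal form: the standard sinusoidal absolute positional encoding of Vaswani et al. (2017), and learned per-offset relative position embeddings in the style of Shaw et al. (2018). This is an illustrative assumption about the general techniques, not the authors' exact formulation; the vector table in the relative variant would be trainable in a real model and is filled with random placeholders here.

```python
import numpy as np

def absolute_sinusoidal_encoding(max_len, d_model):
    """Standard sinusoidal absolute positional encoding (Vaswani et al., 2017).

    Returns a (max_len, d_model) matrix that is added to the token embeddings.
    """
    positions = np.arange(max_len)[:, None]            # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]            # (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

def relative_position_embeddings(seq_len, d_head, max_distance=16, seed=0):
    """Relative positional encoding in the style of Shaw et al. (2018).

    One vector per clipped offset j - i in [-max_distance, max_distance];
    offsets beyond the clipping range share the boundary vector. In a real
    model the table is trainable and enters the attention logits via
    q_i . r_{clip(j - i)}; random values serve as placeholders here.
    """
    rng = np.random.default_rng(seed)
    table = rng.normal(size=(2 * max_distance + 1, d_head))
    offsets = np.arange(seq_len)[None, :] - np.arange(seq_len)[:, None]
    clipped = np.clip(offsets, -max_distance, max_distance) + max_distance
    return table[clipped]                                # (seq_len, seq_len, d_head)

if __name__ == "__main__":
    print(absolute_sinusoidal_encoding(50, 8).shape)     # (50, 8)
    print(relative_position_embeddings(50, 8).shape)     # (50, 50, 8)
```

Unlike the absolute variant, the relative table depends only on offsets between positions, which is one intuition for why relative encodings generalize better to sentences longer than any seen in training.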
Anthology ID:
2019.iwslt-1.20
Volume:
Proceedings of the 16th International Conference on Spoken Language Translation
Month:
November 2-3
Year:
2019
Address:
Hong Kong
Editors:
Jan Niehues, Rolando Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2019.iwslt-1.20
Cite (ACL):
Jan Rosendahl, Viet Anh Khoa Tran, Weiyue Wang, and Hermann Ney. 2019. Analysis of Positional Encodings for Neural Machine Translation. In Proceedings of the 16th International Conference on Spoken Language Translation, Hong Kong. Association for Computational Linguistics.
Cite (Informal):
Analysis of Positional Encodings for Neural Machine Translation (Rosendahl et al., IWSLT 2019)
PDF:
https://aclanthology.org/2019.iwslt-1.20.pdf