Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings

Yui Oka, Katsuki Chousa, Katsuhito Sudoh, Satoshi Nakamura


Abstract
Neural Machine Translation often suffers from an under-translation problem due to its limited modeling of output sequence lengths. In this work, we propose a novel approach to training a Transformer model using length constraints based on length-aware positional encoding (PE). Since length constraints with exact target sentence lengths degrade translation performance, we add random noise within a certain window size to the length constraints in the PE during training. At inference time, we predict the output lengths from the input sequences using a BERT-based length prediction model. Experimental results on ASPEC English-to-Japanese translation showed that the proposed method produced translations with lengths close to the reference ones and outperformed a vanilla Transformer (especially on short sentences) by 3.22 BLEU points. On average, translations obtained with our length prediction model were also better than those of another baseline that uses input lengths as the length constraints. The proposed noise injection improved robustness to length prediction errors, especially within the window size.
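The following is a minimal illustrative sketch (not the authors' implementation) of the two ideas described above: a sinusoidal positional encoding driven by the remaining length toward a length constraint, and uniform noise added to that constraint within a window during training. All function names, the remaining-length formulation, and the noise distribution are assumptions for illustration.

```python
# Illustrative sketch only -- not the authors' code.
# Assumption: the length constraint enters the PE as a "remaining length"
# signal (constraint - position), and training-time noise is drawn
# uniformly from [-window, +window].
import numpy as np

def length_aware_positional_encoding(length_constraint, max_len, d_model):
    """Sinusoidal PE computed over the remaining length (constraint - i)."""
    pe = np.zeros((max_len, d_model))
    for i in range(max_len):
        remaining = length_constraint - i  # remaining-length signal
        for k in range(0, d_model, 2):
            angle = remaining / (10000 ** (k / d_model))
            pe[i, k] = np.sin(angle)
            if k + 1 < d_model:
                pe[i, k + 1] = np.cos(angle)
    return pe

def noisy_length_constraint(target_len, window, rng=np.random):
    """Perturb the true target length by uniform integer noise in +/- window."""
    noise = rng.randint(-window, window + 1)
    return max(1, target_len + noise)

# Training-time usage sketch: perturb the reference length, then build the PE.
ref_len = 23
noisy_len = noisy_length_constraint(ref_len, window=3)
pe = length_aware_positional_encoding(noisy_len, max_len=64, d_model=512)
```

At inference, the length constraint would instead come from a length prediction model over the source sentence, so the training-time noise teaches the decoder to tolerate prediction errors of roughly the window size.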
Anthology ID:
2020.coling-main.319
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
3580–3585
URL:
https://aclanthology.org/2020.coling-main.319
DOI:
10.18653/v1/2020.coling-main.319
Cite (ACL):
Yui Oka, Katsuki Chousa, Katsuhito Sudoh, and Satoshi Nakamura. 2020. Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings. In Proceedings of the 28th International Conference on Computational Linguistics, pages 3580–3585, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Incorporating Noisy Length Constraints into Transformer with Length-aware Positional Encodings (Oka et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-main.319.pdf
Data
ASPEC