%0 Conference Proceedings
%T RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition
%A Zeyer, Albert
%A Alkhouli, Tamer
%A Ney, Hermann
%Y Liu, Fei
%Y Solorio, Thamar
%S Proceedings of ACL 2018, System Demonstrations
%D 2018
%8 July
%I Association for Computational Linguistics
%C Melbourne, Australia
%F zeyer-etal-2018-returnn
%X We demonstrate the fast training and decoding speed of RETURNN for attention models in translation, owing to fast CUDA LSTM kernels and a fast pure TensorFlow beam search decoder. We show that a layer-wise pretraining scheme for recurrent attention models gives over 1% absolute BLEU improvement and allows training deeper recurrent encoder networks. Promising preliminary results on maximum expected BLEU training are presented. We are able to train state-of-the-art models for translation and end-to-end models for speech recognition, and show results on WMT 2017 and Switchboard. The flexibility of RETURNN allows a fast research feedback loop for experimenting with alternative architectures, and its generality allows it to be used in a wide range of applications.
%R 10.18653/v1/P18-4022
%U https://aclanthology.org/P18-4022
%U https://doi.org/10.18653/v1/P18-4022
%P 128-133