Effectively pretraining a speech translation decoder with Machine Translation data

Ashkan Alinejad, Anoop Sarkar


Abstract
Directly translating from speech to text with an end-to-end approach remains challenging for many language pairs due to insufficient data. Although pretraining the encoder parameters on the Automatic Speech Recognition (ASR) task improves results in low-resource settings, attempts to use pretrained parameters from the Neural Machine Translation (NMT) task have been largely unsuccessful in previous work. In this paper, we show that an adversarial regularizer can bring the encoder representations of the ASR and NMT tasks closer even though the two tasks operate on different modalities, and that this allows us to effectively use a pretrained NMT decoder for speech translation.
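
To make the idea concrete, below is a minimal PyTorch sketch of one common way to implement such an adversarial regularizer: a modality discriminator tries to distinguish speech-derived encoder states from text-derived ones, and a gradient-reversal layer trains both encoders to fool it, pushing the two representations toward a shared space. This is an illustration of the general technique under assumed design choices, not the paper's actual implementation; all class and function names here are hypothetical.

```python
# Illustrative sketch only (hypothetical names, not the authors' code):
# an adversarial regularizer that aligns ASR and NMT encoder states.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates gradients in the backward
    pass, so the encoders learn to *fool* the modality discriminator."""

    @staticmethod
    def forward(ctx, x):
        return x

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class ModalityDiscriminator(nn.Module):
    """Predicts whether a pooled encoder state came from speech (ASR)
    or text (NMT) input."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        pooled = enc_states.mean(dim=1)  # (batch, dim): mean over time steps
        return self.net(GradReverse.apply(pooled))  # (batch, 1) logits


def adversarial_loss(disc, speech_states, text_states):
    """Binary cross-entropy with label 1 for speech encodings and 0 for
    text. Because of gradient reversal, minimizing this loss trains the
    discriminator while making the encoder outputs less separable."""
    logits = torch.cat([disc(speech_states), disc(text_states)], dim=0)
    labels = torch.cat(
        [torch.ones(speech_states.size(0), 1),
         torch.zeros(text_states.size(0), 1)], dim=0
    )
    return nn.functional.binary_cross_entropy_with_logits(logits, labels)
```

In joint training, this loss would typically be added to the ASR and NMT task objectives with a small weight, so the pressure toward a shared representation space does not overwhelm the task losses themselves.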
Anthology ID:
2020.emnlp-main.644
Volume:
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month:
November
Year:
2020
Address:
Online
Editors:
Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
8014–8020
URL:
https://aclanthology.org/2020.emnlp-main.644
DOI:
10.18653/v1/2020.emnlp-main.644
Cite (ACL):
Ashkan Alinejad and Anoop Sarkar. 2020. Effectively pretraining a speech translation decoder with Machine Translation data. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8014–8020, Online. Association for Computational Linguistics.
Cite (Informal):
Effectively pretraining a speech translation decoder with Machine Translation data (Alinejad & Sarkar, EMNLP 2020)
PDF:
https://aclanthology.org/2020.emnlp-main.644.pdf
Video:
https://slideslive.com/38939224