Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation

Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler


Abstract
End-to-end speech translation relies on data that pair source-language speech inputs with corresponding translations into a target language. Such data are notoriously scarce, making synthetic data augmentation by back-translation or knowledge distillation a necessary ingredient of end-to-end training. In this paper, we present a novel approach to data augmentation that leverages audio alignments, linguistic properties, and translation. First, we augment a transcription by sampling from a suffix memory that stores text and audio data. Second, we translate the augmented transcript. Finally, we recombine concatenated audio segments and the generated translation. Our method delivers consistent improvements of up to 0.9 and 1.1 BLEU points on top of augmentation with knowledge distillation on five language pairs on CoVoST 2 and on two language pairs on Europarl-ST, respectively.
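As a rough illustration of the three steps named in the abstract (sample, translate, recombine), the following Python sketch may help. It is not the authors' implementation. The `Aligned` type, the `memory` dictionary keyed by a pivot label, the label "VERB", and the stand-in translation function `toy_mt` are all assumptions introduced here for illustration, and audio is reduced to short lists of floats.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Aligned:
    """One transcript word with its aligned audio samples (toy representation)."""
    word: str
    audio: List[float]


def sample(prefix: List[Aligned], key: str,
           memory: Dict[str, List[List[Aligned]]],
           rng: random.Random) -> List[Aligned]:
    """Step 1: extend a transcript prefix with a suffix drawn from the memory.

    The memory layout (pivot label -> list of suffixes) is an assumption.
    """
    suffix = rng.choice(memory[key])
    return prefix + suffix


def translate(words: List[str], mt: Callable[[str], str]) -> str:
    """Step 2: translate the augmented transcript with any text-to-text function."""
    return mt(" ".join(words))


def recombine(augmented: List[Aligned], translation: str) -> Tuple[List[float], str]:
    """Step 3: concatenate the audio segments and pair them with the translation."""
    audio = [s for item in augmented for s in item.audio]
    return audio, translation


def augment(prefix: List[Aligned], key: str,
            memory: Dict[str, List[List[Aligned]]],
            mt: Callable[[str], str],
            rng: random.Random) -> Tuple[List[float], str]:
    """Run the three steps and return one synthetic (audio, translation) pair."""
    augmented = sample(prefix, key, memory, rng)
    target = translate([a.word for a in augmented], mt)
    return recombine(augmented, target)


# Toy usage: a single stored suffix and an upper-casing stand-in for an MT system.
memory = {"VERB": [[Aligned("sings", [0.3]), Aligned("loudly", [0.4, 0.5])]]}
prefix = [Aligned("she", [0.1, 0.2])]
toy_mt = lambda s: s.upper()
audio, target = augment(prefix, "VERB", memory, toy_mt, random.Random(0))
```

In a real setting the stand-in `toy_mt` would be replaced by a trained machine translation model, and the audio lists by waveform or feature segments cut at the word alignment boundaries.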
Anthology ID:
2022.acl-short.27
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
245–254
URL:
https://aclanthology.org/2022.acl-short.27
DOI:
10.18653/v1/2022.acl-short.27
Cite (ACL):
Tsz Kin Lam, Shigehiko Schamoni, and Stefan Riezler. 2022. Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 245–254, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation (Lam et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-short.27.pdf
Software:
 2022.acl-short.27.software.tgz
Video:
 https://aclanthology.org/2022.acl-short.27.mp4
Data
Europarl-ST