@inproceedings{ko-etal-2023-tagged,
title = "Tagged End-to-End Simultaneous Speech Translation Training Using Simultaneous Interpretation Data",
author = "Ko, Yuka and
Fukuda, Ryo and
Nishikawa, Yuta and
Kano, Yasumasa and
Sudoh, Katsuhito and
Nakamura, Satoshi",
editor = "Salesky, Elizabeth and
Federico, Marcello and
Carpuat, Marine",
booktitle = "Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)",
month = jul,
year = "2023",
address = "Toronto, Canada (in-person and online)",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.iwslt-1.34",
doi = "10.18653/v1/2023.iwslt-1.34",
pages = "363--375",
abstract = "Simultaneous speech translation (SimulST) translates partial speech inputs incrementally. Although the monotonic correspondence between input and output is preferable for smaller latency, it is not the case for distant language pairs such as English and Japanese. A prospective approach to this problem is to mimic simultaneous interpretation (SI) using SI data to train a SimulST model. However, the size of such SI data is limited, so the SI data should be used together with ordinary bilingual data whose translations are given in offline. In this paper, we propose an effective way to train a SimulST model using mixed data of SI and offline. The proposed method trains a single model using the mixed data with style tags that tell the model to generate SI- or offline-style outputs. Experiment results show improvements of BLEURT in different latency ranges, and our analyses revealed the proposed model generates SI-style outputs more than the baseline.",
}
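
A minimal sketch of the tagged training idea the abstract describes: prepend a style tag to each target sentence so that a single model trained on mixed SI and offline data can be steered toward either output style at decoding time. The tag strings, data layout, and helper names below are illustrative assumptions, not the authors' implementation.

# Hedged sketch of style-tagged mixed-data training for SimulST
# (assumed tag strings and data layout, not the paper's actual code).

SI_TAG = "<si>"        # assumed tag for simultaneous-interpretation-style targets
OFFLINE_TAG = "<off>"  # assumed tag for offline-translation-style targets

def tag_targets(pairs, tag):
    """Prepend a style tag to every target in (source, target) pairs."""
    return [(src, f"{tag} {tgt}") for src, tgt in pairs]

def build_mixed_corpus(si_pairs, offline_pairs):
    """Merge scarce SI data with larger offline bitext into one tagged corpus."""
    return tag_targets(si_pairs, SI_TAG) + tag_targets(offline_pairs, OFFLINE_TAG)

# Toy usage: two tiny corpora combined for training a single model.
si = [("thank you very much", "ありがとう ございます")]
offline = [("thank you very much", "誠に ありがとう ございます")]
mixed = build_mixed_corpus(si, offline)

# At inference time, forcing the first decoder token to SI_TAG requests
# SI-style (lower-latency, more monotonic) output from the single model.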