SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation

Xutai Ma, Juan Pino, Philipp Koehn


Abstract
We investigate how to adapt simultaneous text translation methods such as wait-k and monotonic multihead attention to end-to-end simultaneous speech translation by introducing a pre-decision module. A detailed analysis is provided on the latency-quality trade-offs of combining fixed and flexible pre-decision with fixed and flexible policies. We also design a novel computation-aware latency metric, adapted from Average Lagging.
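For readers unfamiliar with the two ideas the abstract builds on, the following is a minimal sketch of a text-level wait-k schedule and of Average Lagging as it is commonly defined in earlier simultaneous text translation work. It is not the computation-aware metric or the pre-decision module proposed in this paper. The function names `wait_k_delays` and `average_lagging` are hypothetical and chosen only for illustration.

```python
def wait_k_delays(src_len, tgt_len, k):
    """Return g(i) for i = 1..tgt_len under a wait-k schedule.

    g(i) is the number of source tokens read before target token i is
    written: the policy first reads k tokens, then alternates one write
    and one read until the source is exhausted.
    """
    return [min(k + i - 1, src_len) for i in range(1, tgt_len + 1)]


def average_lagging(delays, src_len, tgt_len):
    """Text-level Average Lagging (illustrative, not computation-aware).

    AL = (1 / tau) * sum_{i=1..tau} [ g(i) - (i - 1) / gamma ],
    where gamma = tgt_len / src_len and tau is the index of the first
    target token written after the whole source has been read.
    """
    gamma = tgt_len / src_len
    # Fall back to the last target index if the source is never fully read.
    tau = next(
        (i for i, g in enumerate(delays, start=1) if g >= src_len),
        len(delays),
    )
    total = sum(delays[i - 1] - (i - 1) / gamma for i in range(1, tau + 1))
    return total / tau


if __name__ == "__main__":
    delays = wait_k_delays(src_len=10, tgt_len=10, k=3)
    print(delays)
    print(average_lagging(delays, src_len=10, tgt_len=10))
```

When source and target have equal length, a wait-k schedule yields an Average Lagging of k under this definition, which is a convenient sanity check. In the speech setting the delays would be measured in time or in encoder states rather than in source tokens, and that is left out of this sketch.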
Anthology ID:
2020.aacl-main.58
Volume:
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
Month:
December
Year:
2020
Address:
Suzhou, China
Editors:
Kam-Fai Wong, Kevin Knight, Hua Wu
Venue:
AACL
Publisher:
Association for Computational Linguistics
Pages:
582–587
URL:
https://aclanthology.org/2020.aacl-main.58
Cite (ACL):
Xutai Ma, Juan Pino, and Philipp Koehn. 2020. SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, pages 582–587, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation (Ma et al., AACL 2020)
PDF:
https://aclanthology.org/2020.aacl-main.58.pdf
Code:
pytorch/fairseq
Data:
MuST-C