Dynamic Sentence Boundary Detection for Simultaneous Translation

Ruiqing Zhang, Chuanqiang Zhang


Abstract
Simultaneous translation is a great challenge in which translation starts before the source sentence is finished. Most studies take transcriptions as input and focus on balancing translation quality and latency for each sentence. However, most ASR systems cannot provide accurate sentence boundaries in real time. Segmenting the streaming words into sentences before translation is therefore a key problem. In this paper, we propose a novel method for sentence boundary detection that treats it as a multi-class classification task under the end-to-end pre-training framework. Experiments show significant improvements in both translation quality and latency.
Anthology ID:
2020.autosimtrans-1.1
Volume:
Proceedings of the First Workshop on Automatic Simultaneous Translation
Month:
July
Year:
2020
Address:
Seattle, Washington
Editors:
Hua Wu, Colin Cherry, Liang Huang, Zhongjun He, Mark Liberman, James Cross, Yang Liu
Venue:
AutoSimTrans
Publisher:
Association for Computational Linguistics
Pages:
1–9
URL:
https://aclanthology.org/2020.autosimtrans-1.1
DOI:
10.18653/v1/2020.autosimtrans-1.1
Cite (ACL):
Ruiqing Zhang and Chuanqiang Zhang. 2020. Dynamic Sentence Boundary Detection for Simultaneous Translation. In Proceedings of the First Workshop on Automatic Simultaneous Translation, pages 1–9, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Dynamic Sentence Boundary Detection for Simultaneous Translation (Zhang & Zhang, AutoSimTrans 2020)
PDF:
https://aclanthology.org/2020.autosimtrans-1.1.pdf
Video:
http://slideslive.com/38929917