HW-TSC’s submission to the IWSLT 2024 Subtitling track

Yuhao Xie, Yuanchang Luo, Zongyao Li, Zhanglin Wu, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hengchao Shang, Jiaxin Guo, Daimeng Wei, Hao Yang


Abstract
This paper describes HW-TSC’s submission to the IWSLT 2024 Subtitling track. For the automatic subtitling track, we use an unconstrained cascaded strategy whose main steps are: ASR with word-level timestamps, sentence segmentation based on punctuation restoration, and further alignment using either CTC or machine translation with a length penalty. For the subtitle compression track, we employ a strategy that combines machine translation models with extensive rewriting models. We identify the subtitle text requiring revision by its CPS (characters per second) value, then use a translation model to obtain the English version of this text. Following this, we produce subtitle text of compressed length through controlled decoding. If this method fails to compress the text successfully, we fall back to a few-shot Llama2 model for further compression.
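The selection and fallback logic described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Subtitle` class, the function names, the 21 CPS default threshold, the choice to count spaces as characters, and the `primary`/`fallback` callables (standing in for controlled decoding and a few-shot LLM rewrite) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subtitle:
    """Hypothetical container for one subtitle block (times in seconds)."""
    start: float
    end: float
    text: str


def cps(sub: Subtitle) -> float:
    """Characters per second; line breaks are ignored, spaces are counted (assumption)."""
    duration = sub.end - sub.start
    if duration <= 0:
        return float("inf")
    return len(sub.text.replace("\n", "")) / duration


def needs_compression(subs: List[Subtitle], max_cps: float = 21.0) -> List[int]:
    """Return indices of subtitles whose reading speed exceeds max_cps (assumed threshold)."""
    return [i for i, s in enumerate(subs) if cps(s) > max_cps]


def compress_with_fallback(
    sub: Subtitle,
    primary: Callable[[str, int], str],
    fallback: Callable[[str, int], str],
    max_cps: float = 21.0,
) -> str:
    """Try the primary compressor; use the fallback only if the result is still too long."""
    budget = int(max_cps * (sub.end - sub.start))  # character budget for this time span
    candidate = primary(sub.text, budget)
    if len(candidate) <= budget:
        return candidate
    return fallback(sub.text, budget)
```

Passing the two compressors in as plain callables keeps the sketch independent of any particular translation or language model toolkit.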
Anthology ID:
2024.iwslt-1.34
Volume:
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Marine Carpuat
Venue:
IWSLT
Publisher:
Association for Computational Linguistics
Pages:
286–290
URL:
https://aclanthology.org/2024.iwslt-1.34
Cite (ACL):
Yuhao Xie, Yuanchang Luo, Zongyao Li, Zhanglin Wu, Xiaoyu Chen, Zhiqiang Rao, Shaojun Li, Hengchao Shang, Jiaxin Guo, Daimeng Wei, and Hao Yang. 2024. HW-TSC’s submission to the IWSLT 2024 Subtitling track. In Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024), pages 286–290, Bangkok, Thailand (in-person and online). Association for Computational Linguistics.
Cite (Informal):
HW-TSC’s submission to the IWSLT 2024 Subtitling track (Xie et al., IWSLT 2024)
PDF:
https://aclanthology.org/2024.iwslt-1.34.pdf