CTC Regularization for Low-Resource Speech-to-Text Translation

Zachary William Hopton, Rico Sennrich


Abstract
The challenges of building speech-to-text translation (ST) systems (e.g., a relative lack of parallel speech–text data and sensitivity to noise in audio) are exacerbated for low-resource language pairs. In this work, we seek to improve low-resource ST by building on previous studies that regularize ST training with the connectionist temporal classification (CTC) loss. By systematically evaluating a diverse range of linguistic annotations as CTC labels across multiple auxiliary loss configurations, we improve speech translation systems in both low- and high-resource settings. These improvements over both a standard end-to-end ST system and a speech LLM indicate a need for continued research on regularizing speech representations in ST.
Anthology ID:
2026.loresmt-1.15
Volume:
Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jonathan Washington, Nathaniel Oco, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
186–197
URL:
https://aclanthology.org/2026.loresmt-1.15/
Cite (ACL):
Zachary William Hopton and Rico Sennrich. 2026. CTC Regularization for Low-Resource Speech-to-Text Translation. In Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026), pages 186–197, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
CTC Regularization for Low-Resource Speech-to-Text Translation (Hopton & Sennrich, LoResMT 2026)
PDF:
https://aclanthology.org/2026.loresmt-1.15.pdf