MLLP-VRAIN UPV system for the IWSLT 2025 Simultaneous Speech Translation Translation task

Jorge Iranzo-Sánchez; Javier Iranzo-Sánchez; Adrià Giménez Pastor; Jorge Civera Saiz; Alfons Juan

doi:10.18653/v1/2025.iwslt-1.35

Abstract

This work describes the participation of the MLLP-VRAIN research group in the shared task of the IWSLT 2025 Simultaneous Speech Translation track. Our submission addresses the unique challenges of real-time translation of long-form speech by developing a modular cascade system that adapts strong pre-trained models to streaming scenarios. We combine Whisper Large-V3-Turbo for ASR with the multilingual NLLB-3.3B model for MT, implementing lightweight adaptation techniques rather than training new end-to-end models from scratch. Our approach employs document-level adaptation with prefix training to enhance the MT model’s ability to handle incomplete inputs, and incorporates adaptive emission policies, including a wait-k strategy and RALCP, for managing the translation stream. Specialized buffer management techniques and segmentation strategies ensure coherent translations across long audio sequences. Experimental results on the ACL60/60 dataset demonstrate that our system achieves a favorable balance between translation quality and latency, with a BLEU score of 31.96 and a non-computation-aware StreamLAAL latency of 2.94 seconds. Our final model achieves a preliminary score of 29.8 BLEU on the official test set (IWSLT25Instruct). Our work demonstrates that carefully adapted pre-trained components can create effective simultaneous translation systems for long-form content without requiring extensive in-domain parallel data or specialized end-to-end training.
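As a rough illustration of the wait-k emission policy mentioned in the abstract, the sketch below computes the generic wait-k read/write schedule: the translator first reads k source tokens, then alternates between writing one target token and reading one more source token until the source is exhausted. This is not the authors' implementation; it ignores ASR, RALCP, buffering, and segmentation, and the function names `wait_k_schedule` and `wait_k_actions` are hypothetical.

```python
from typing import List


def wait_k_schedule(num_source: int, num_target: int, k: int) -> List[int]:
    """For each target position t (0-indexed), return how many source
    tokens must have been read before that target token is written.

    Generic wait-k rule: g(t) = min(k + t, num_source).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t, num_source) for t in range(num_target)]


def wait_k_actions(num_source: int, num_target: int, k: int) -> List[str]:
    """Expand the schedule into an explicit sequence of READ/WRITE actions."""
    actions: List[str] = []
    read = 0
    for needed in wait_k_schedule(num_source, num_target, k):
        # Read source tokens until the policy allows the next write.
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    # Toy example: 5 source tokens, 6 target tokens, k = 2.
    print(wait_k_schedule(5, 6, 2))
    print(wait_k_actions(3, 3, 2))
```

A larger k delays the first write and gives the translation model more source context, which typically trades higher latency for better quality; the abstract's BLEU and StreamLAAL figures describe that kind of trade-off for the full system.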

Anthology ID: 2025.iwslt-1.35
Volume: Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
Month: July
Year: 2025
Address: Vienna, Austria (in-person and online)
Editors: Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
Venues: IWSLT | WS
Publisher: Association for Computational Linguistics
Pages: 340–346
URL: https://aclanthology.org/2025.iwslt-1.35/
DOI: 10.18653/v1/2025.iwslt-1.35
Cite (ACL): Jorge Iranzo-Sánchez, Javier Iranzo-Sanchez, Adrià Giménez Pastor, Jorge Civera Saiz, and Alfons Juan. 2025. MLLP-VRAIN UPV system for the IWSLT 2025 Simultaneous Speech Translation Translation task. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 340–346, Vienna, Austria (in-person and online). Association for Computational Linguistics.
Cite (Informal): MLLP-VRAIN UPV system for the IWSLT 2025 Simultaneous Speech Translation Translation task (Iranzo-Sánchez et al., IWSLT 2025)
PDF: https://aclanthology.org/2025.iwslt-1.35.pdf
