Aziz Sharipov Ortega

2026

A Pocket Offline Model for Simultaneous Speech Translation as CUNI Submission to IWSLT 2026
Aziz Sharipov Ortega | Dominik Macháček
Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)

We implement the direct speech translation model Canary for simultaneous translation with the AlignAtt simultaneous policy. We focus on the NeMo toolkit and the recent state-of-the-art foundation model Canary-1B-v2, which has only one billion parameters and is therefore suitable for small pocket devices. This is the CUNI submission to the IWSLT 2026 Simultaneous Speech Translation shared task on Czech to English and on English to German and Italian.

Co-authors

Dominik Macháček 1

Venues

IWSLT 1
WS 1
