Aziz Sharipov Ortega


2026

We adapt Canary, a direct speech translation model, to simultaneous translation using the AlignAtt simultaneous policy. We build on the NeMo toolkit and the recent state-of-the-art foundation model Canary-1B-v2, which has only one billion parameters and is therefore suitable for small pocket-sized devices. This is a CUNI submission to the IWSLT 2026 Simultaneous Speech Translation shared task, covering Czech-to-English, English-to-German, and English-to-Italian.