HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task

Kevin Krahn; Eric Fosler-Lussier

Abstract

We present HydraQE, our contribution to the IWSLT 2026 Speech Translation Metrics shared task. HydraQE is an end-to-end, reference-free quality estimation (QE) system for speech translation built on a Qwen3-ASR backbone, which accepts source audio and a translation hypothesis as joint input. Hidden states from all backbone layers are combined via a sparsemax scalar mix, then re-encoded by a bidirectional Transformer for full cross-modal interaction. To address the scarcity of human-annotated speech translation data, three independent prediction heads are trained on complementary supervision signals: human direct assessment (DA) annotations, MetricX-24 pseudo-labels, and xCOMET pseudo-labels. We train on a combination of synthetically corrupted examples and silver pseudo-labeled machine translation outputs, using a curriculum that begins on synthetic and silver data and gradually shifts toward human-annotated examples. HydraQE outperforms cascaded text-based baselines and prior direct speech QE systems, demonstrating that end-to-end speech translation QE is competitive with cascaded approaches.
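
The layer-combination step described above can be illustrated with a minimal sketch. It assumes the standard sparsemax projection (Martins and Astudillo, 2016) applied to one learned score per backbone layer, with the resulting sparse weights used to average per-layer hidden vectors. The function names `sparsemax` and `scalar_mix`, the toy dimensions, and the absence of any extra scaling factor are illustrative assumptions, not details taken from the paper.

```python
from typing import List


def sparsemax(scores: List[float]) -> List[float]:
    """Euclidean projection of `scores` onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, so some
    layers may be dropped from the mix entirely.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        if 1.0 + k * z > cumsum:
            # Support still includes the k-th largest score.
            tau = (cumsum - 1.0) / k
        else:
            break
    return [max(s - tau, 0.0) for s in scores]


def scalar_mix(layer_states: List[List[float]], layer_scores: List[float]) -> List[float]:
    """Weighted sum of per-layer hidden vectors using sparsemax weights.

    `layer_states[i]` is the hidden vector from layer i (one position,
    for simplicity); `layer_scores[i]` is that layer's learned score.
    Both are hypothetical stand-ins for the real model's tensors.
    """
    if len(layer_states) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = sparsemax(layer_scores)
    dim = len(layer_states[0])
    mixed = [0.0] * dim
    for weight, state in zip(weights, layer_states):
        for j in range(dim):
            mixed[j] += weight * state[j]
    return mixed


if __name__ == "__main__":
    # Three toy "layers" with 2-dimensional hidden states.
    states = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    scores = [1.0, 0.8, 0.1]
    print(sparsemax(scores))
    print(scalar_mix(states, scores))
```

In this toy run the lowest-scoring layer receives a weight of exactly zero, which is the property that distinguishes a sparsemax mix from the more common softmax-weighted scalar mix.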

Anthology ID: 2026.iwslt-1.37
Volume: Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
Month: July
Year: 2026
Address: San Diego, USA (in-person and online)
Editors: Elizabeth Salesky, Antonios Anastasopoulos, Matteo Negri, Marcello Federico
Venues: IWSLT | WS
SIG: SIGSLT
Publisher: Association for Computational Linguistics
Pages: 323–331
URL: https://aclanthology.org/2026.iwslt-1.37/
Cite (ACL): Kevin Krahn and Eric Fosler-Lussier. 2026. HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task. In Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026), pages 323–331, San Diego, USA (in-person and online). Association for Computational Linguistics.
Cite (Informal): HydraQE: OSU’s Submission for the IWSLT 2026 Speech Translation Metrics Shared Task (Krahn & Fosler-Lussier, IWSLT 2026)
PDF: https://aclanthology.org/2026.iwslt-1.37.pdf
