Maximilian Awiszus


2024

Charles Locock, Lowcock or Lockhart? Offline Speech Translation: Test Suite for Named Entities
Maximilian Awiszus | Jan Niehues | Marco Turchi | Sebastian Stüker | Alex Waibel
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)

Generating rare words is a challenging task for natural language processing in general and for speech translation (ST) specifically. This paper introduces a test suite prepared for the Offline ST shared task at IWSLT. In the test suite, corresponding rare words (i.e., named entities) were annotated on TED talks for English and German, and the English side was made available to the participants together with some distractors (irrelevant named entities). Our evaluation checks the capabilities of ST systems to leverage the information in the contextual list of named entities and improve translation quality. Systems are ranked based on the recall and precision of named entities (separately on person, location, and organization names) in the translated texts. Our evaluation shows that using contextual information improves translation quality as well as the recall and precision of NEs. The recall of organization names in all submissions is the lowest of all categories, with a maximum of 87.5%, confirming the difficulties of ST systems in dealing with names.

2020

KIT’s IWSLT 2020 SLT Translation System
Ngoc-Quan Pham | Felix Schneider | Tuan-Nam Nguyen | Thanh-Le Ha | Thai Son Nguyen | Maximilian Awiszus | Sebastian Stüker | Alexander Waibel
Proceedings of the 17th International Conference on Spoken Language Translation

This paper describes KIT’s submissions to the IWSLT 2020 Speech Translation evaluation campaign. We first participate in the simultaneous translation task, in which our simultaneous models are Transformer-based and can be efficiently trained to obtain low latency with minimal compromise in quality. On the offline speech translation task, we applied our new Speech Transformer architecture to end-to-end speech translation. The resulting model provides translation quality that is competitive with a complicated cascade. The latter still has the upper hand, thanks to its ability to transparently access the transcription and resegment the inputs to avoid fragmentation.